Preface


Richard Courant’s Differential and Integral Calculus, Vols. I and 
II, has been tremendously successful in introducing several gener- 
ations of mathematicians to higher mathematics. Throughout, those 
volumes presented the important lesson that meaningful mathematics 
is created from a union of intuitive imagination and deductive reason- 
ing. In preparing this revision the authors have endeavored to main- 
tain the healthy balance between these two modes of thinking which 
characterized the original work. Although Richard Courant did not 
live to see the publication of this revision of Volume II, all major 
changes had been agreed upon and drafted by the authors before Dr. 
Courant’s death in January 1972. 

From the outset, the authors realized that Volume II, which deals
with functions of several variables, would have to be revised more
drastically than Volume I. In particular, it seemed desirable to treat
the fundamental theorems on integration in higher dimensions with 
the same degree of rigor and generality applied to integration in one 
dimension. In addition, there were a number of new concepts and 
topics of basic importance, which, in the opinion of the authors, belong 
to an introduction to analysis. 

Only minor changes were made in the short chapters (6, 7, and 8) 
dealing, respectively, with Differential Equations, Calculus of Vari- 
ations, and Functions of a Complex Variable. In the core of the book, 
Chapters 1-5, we retained as much as possible the original scheme of 
two roughly parallel developments of each subject at different levels: 
an informal introduction based on more intuitive arguments together 
with a discussion of applications laying the groundwork for the 
subsequent rigorous proofs. 

The material from linear algebra contained in the original Chapter 
1 seemed inadequate as a foundation for the expanded calculus struc- 
ture. Thus, this chapter (now Chapter 2) was completely rewritten and 
now presents all the required properties of nth order determinants and 
matrices, multilinear forms, Gram determinants, and linear manifolds. 




The new Chapter 1 contains all the fundamental properties of 
linear differential forms and their integrals. These prepare the reader 
for the introduction to higher-order exterior differential forms added 
to Chapter 3. Also found now in Chapter 3 are a new proof of the 
implicit function theorem by successive approximations and a discus- 
sion of numbers of critical points and of indices of vector fields in two 
dimensions. 

Extensive additions were made to the fundamental properties of 
multiple integrals in Chapters 4 and 5. Here one is faced with a familiar 
difficulty: integrals over a manifold M, defined easily enough by 
subdividing M into convenient pieces, must be shown to be inde- 
pendent of the particular subdivision. This is resolved by the sys- 
tematic use of the family of Jordan measurable sets with its finite 
intersection property and of partitions of unity. In order to minimize 
topological complications, only manifolds imbedded smoothly into 
Euclidean space are considered. The notion of “orientation” of a 
manifold is studied in the detail needed for the discussion of integrals 
of exterior differential forms and of their additivity properties. On this 
basis, proofs are given for the divergence theorem and for Stokes’s 
theorem in n dimensions. To the section on Fourier integrals in 
Chapter 4 there has been added a discussion of Parseval’s identity and 
of multiple Fourier integrals. 

Invaluable in the preparation of this book was the continued 
generous help extended by two friends of the authors, Professors 
Albert A. Blank of Carnegie-Mellon University, and Alan Solomon 
of the University of the Negev. Almost every page bears the imprint 
of their criticisms, corrections, and suggestions. In addition, they 
prepared the problems and exercises for this volume.¹

Thanks are due also to our colleagues, Professors K. O. Friedrichs 
and Donald Ludwig for constructive and valuable suggestions, and to 
John Wiley and Sons and their editorial staff for their continuing 
encouragement and assistance. 


FRITZ JOHN 


New York
September 1973 


¹ In contrast to Volume I, these have been incorporated completely into the text;
their solutions can be found at the end of the volume. 
Introduction to Calculus and Analysis
Volume II 


CHAPTER 
1 


Functions of Several 
Variables and Their Derivatives 


The concepts of limit, continuity, derivative, and integral, as 
developed in Volume I, are also basic in two or more independent 
variables. However, in higher dimensions many new phenomena, 
which have no counterpart at all in the theory of functions of a single 
variable, must be dealt with. As a rule, a theorem that can be proved 
for functions of two variables may be extended easily to functions of 
more than two variables without any essential change in the proof. 
In what follows, therefore, we often confine ourselves to functions of 
two variables, where relations are much more easily visualized 
geometrically, and discuss functions of three or more variables only 
when some additional insight is gained thereby; this also permits 
simpler geometrical interpretations of our results. 


1.1 Points and Point Sets in the Plane and in Space 


a. Sequences of Points: Convergence 


An ordered pair of values (x, y) can be represented geometrically 
by the point P having x and y as coordinates in some Cartesian coor- 
dinate system. The distance between two points P = (x, y) and P’ = 
(x’, y’) is given by the formula 


$\overline{PP'} = \sqrt{(x' - x)^2 + (y' - y)^2},$


which is basic for euclidean geometry. We use the notion of distance 
to define the neighborhoods of a point. The ε-neighborhood of a point
C = (α, β) consists of all the points P = (x, y) whose distance from
C is less than ε; geometrically this is the circular disk¹ of center C
and radius ε that is described by the inequality

$(x - \alpha)^2 + (y - \beta)^2 < \varepsilon^2.$

We shall consider infinite sequences of points

$P_1 = (x_1, y_1),\ P_2 = (x_2, y_2),\ \ldots,\ P_n = (x_n, y_n),\ \ldots$


For example, $P_n = (n, n^2)$ defines a sequence all of whose points lie
on the parabola $y = x^2$. The points in a sequence do not all have to be
distinct. For example, the infinite sequence $P_n = (2, (-1)^n)$ has only
two distinct elements.

The sequence $P_1, P_2, \ldots$ is bounded if a disk can be found
containing all of the $P_n$, that is, if there is a point Q and a number M
such that $\overline{P_nQ} < M$ for all n. Thus the sequence $P_n = (1/n, 1/n^2)$ is
bounded, and the sequence $(n, n^2)$, unbounded.

The most important concept associated with sequences is that of 
convergence. We say that a sequence of points $P_1, P_2, \ldots$ converges
to a point Q, or that

$\lim_{n \to \infty} P_n = Q,$

if the distances $\overline{P_nQ}$ converge to 0. Thus, $\lim_{n \to \infty} P_n = Q$ means that for
every ε > 0 there exists a number N such that $P_n$ lies in the
ε-neighborhood of Q for all n > N.²
For example, for the sequence of points defined by
$P_n = (e^{-n/4} \cos n,\ e^{-n/4} \sin n)$, we have $\lim_{n \to \infty} P_n = (0, 0) = Q$, since here

$\overline{P_nQ} = e^{-n/4} \to 0$ for $n \to \infty$.

We note that the $P_n$ approach the origin Q along the logarithmic
spiral with equation $r = e^{-\theta/4}$ in polar coordinates r, θ (see Fig. 1.1).
Convergence of the sequence of points $P_n = (x_n, y_n)$ to the point


¹ The word “circle,” as used ordinarily, is ambiguous, referring either to a curve or
to the region bounded by it. We shall follow the current practice of reserving the
term “circle” for the curve only, and the term “circular region” or “disk” for the
two-dimensional region. Similarly, in space we distinguish the “sphere” (i.e., the
spherical surface) from the solid three-dimensional “ball” that it bounds.
² Equivalently, any disk with center Q contains all but a finite number of the $P_n$.
The notation $P_n \to Q$ for $n \to \infty$ will also be used.
Figure 1.1 Converging sequence $P_n$.


Q = (a, b) means that the two sequences of numbers $x_n$ and $y_n$
converge separately and that

$\lim_{n \to \infty} x_n = a, \quad \lim_{n \to \infty} y_n = b.$

Indeed, smallness of $\overline{P_nQ}$ implies that both $x_n - a$ and $y_n - b$ are
small, since $|x_n - a| \le \overline{P_nQ}$, $|y_n - b| \le \overline{P_nQ}$; conversely,

$\overline{P_nQ} = \sqrt{(x_n - a)^2 + (y_n - b)^2} \le |x_n - a| + |y_n - b|,$

so that $\overline{P_nQ} \to 0$ when both $x_n \to a$ and $y_n \to b$.
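
As a numerical illustration of the two inequalities just used, the following Python sketch evaluates the spiral sequence with points (e^(-n/4) cos n, e^(-n/4) sin n) and compares the distance to Q = (0, 0) with the coordinate differences. The names `spiral_point` and `dist` are chosen here for illustration only and are not part of the text.

```python
import math

def spiral_point(n):
    """Return P_n = (e^(-n/4) cos n, e^(-n/4) sin n)."""
    r = math.exp(-n / 4)
    return (r * math.cos(n), r * math.sin(n))

def dist(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

Q = (0.0, 0.0)
for n in (1, 10, 40):
    P = spiral_point(n)
    d = dist(P, Q)
    dx, dy = abs(P[0] - Q[0]), abs(P[1] - Q[1])
    # Coordinate differences are bounded by the distance, and the
    # distance is bounded by the sum of the coordinate differences.
    print(n, d, max(dx, dy) <= d <= dx + dy)
```

The printed distances shrink like e^(-n/4), so the coordinates of the points tend to 0 separately, in agreement with the argument above.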

Just as in the case of sequences of numbers, we can prove that a
sequence of points converges, without knowing the limit, using
Cauchy’s intrinsic convergence test. In two dimensions this asserts:
For the convergence of a sequence of points $P_n = (x_n, y_n)$ it is
necessary and sufficient that for every ε > 0 the inequality $\overline{P_nP_m} < \varepsilon$
holds for all n, m exceeding a suitable value N = N(ε). The proof
follows immediately by applying the Cauchy test for sequences of
numbers to each of the sequences $x_n$ and $y_n$.


b. Sets of Points in the Plane 


In the study of functions of a single variable x we generally per- 
mitted x to vary over an “interval,” which could be either closed or 
open, bounded or unbounded. As possible domains of functions in
higher dimensions, a greater variety of sets has to be considered and 
terms have to be introduced describing the simplest properties of such 
sets. In the plane we shall usually consider either curves or two- 
dimensional regions. Plane curves have been discussed extensively 
in Volume I (Chapter 4). Ordinarily they are given either
“nonparametrically” in the form y = f(x) or “parametrically” by a pair of
functions x = φ(t), y = ψ(t), or “implicitly” by an equation F(x, y)
= 0 (we shall say more about implicit representations in Chapter 3).

In addition to curves, we have two-dimensional sets of points, 
forming a region. A region may be the entire xy-plane or a portion of 
the plane bounded by a simple closed curve (in this case forming a 
simply connected region as shown in Fig. 1.2) or by several such 
curves. In the last case it is said to be a multiply connected region, 
the number of boundary curves giving the so-called connectivity; Fig. 
1.3, for example, shows a triply connected region. A plane set may not 
be connected¹ at all, consisting of several separate portions (Fig. 1.4).


Figure 1.2 A simply connected region. Figure 1.3 A triply connected region. 


Figure 1.4 A nonconnected region R. 


¹ For a precise definition of “connected,” see p. 102.

Ordinarily the boundary curves of the regions to be considered are
sectionally smooth. That is, every such curve consists of a finite 
number of arcs, each of which has a continuously turning tangent 
at all of its points, including the end points. Such curves, therefore, 
can have at most a finite number of corners. 

In most cases we shall describe a region by one or more inequali- 
ties, the equal sign holding on some portion of the boundary. The two 
most important types of regions, which recur again and again, are the 
rectangular regions (with sides parallel to the coordinate axes) and 
the circular disks. A rectangular region (Fig. 1.5) consists of the 
points (x, y) whose coordinates satisfy inequalities of the form 


a<x< b, c<y<d; 


each coordinate is restricted to a definite interval, and the point 
(x, y) varies over the interior of a rectangle. As defined here, our 
rectangular region is open; that is, it does not contain its boundary. 


Figure 1.5 A rectangular region. 


The boundary curves are obtained by replacing one or more of the 
inequalities defining the region by equality and permitting (but not 
requiring) the equal sign in the others. For example,

$x = a, \quad c \le y \le d$

defines one of the sides of the rectangle. The closed rectangle
obtained by adding all the boundary points to the set is described by the
inequalities

$a \le x \le b, \quad c \le y \le d.$

The circular disk with center (α, β) and radius r (Fig. 1.6) is, as
seen before, given by the inequality

$(x - \alpha)^2 + (y - \beta)^2 < r^2.$

Figure 1.6 A circular disk.

Adding the boundary circle to this “open” disk, we obtain the “closed
disk” described by

$(x - \alpha)^2 + (y - \beta)^2 \le r^2.$


c. The Boundary of a Set. Closed and Open Sets 


One might think of the boundary of a region as a kind of membrane 
separating the points belonging to the region from those that do not 
belong. As we shall see, this intuitive notion of boundary would not 
always have a meaning. It is remarkable, however, that there is a 
way to define quite generally the boundary of any point set whatsoever 
in a way which is, at least, consistent with our intuitive notion. We 
say that a point P is a boundary point of a set S of points if every 
neighborhood of P contains both points belonging to S and points not 
belonging to S. Consequently, if P is not a boundary point, there 
exists a neighborhood of P that contains only one kind of point; that 
is, we either can find a neighborhood of P that consists entirely of 
points of S, in which case we call P an interior point of S, or 
we can find a neighborhood of P entirely free of points of S, in 
which case we call P an exterior point of S. Thus, for a given set S of 
points, every point in the plane is either boundary point or interior 
point or exterior point of S and belongs to only one of these classes. 
The set of boundary points of S forms the boundary of S, denoted
by the symbol ∂S.

For example, let S be the rectangular region 


a<x< b, c<y<d. 
Obviously, we can find for any point P of S a small circular disk with
center P = (α, β) that is entirely contained in S; we only have to take
an ε-neighborhood of P in which ε is positive and so small that

$a < \alpha - \varepsilon < \alpha + \varepsilon < b, \quad c < \beta - \varepsilon < \beta + \varepsilon < d.$


This shows that here every point of S is an interior point. The bound- 
ary points P of S are just the points lying either on one of the sides 
or at a corner of the rectangle; in the first case, one-half of every 
sufficiently small neighborhood of P will belong to S and one-half 
will not. In the second case, one-quarter of every neighborhood 
belongs to S and three-quarters do not (Fig. 1.7). 
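
The classification of points relative to the open rectangle a < x < b, c < y < d can be sketched in a few lines of Python. This is only an illustration under the assumption a < b and c < d; the function names `classify` and `interior_radius` are hypothetical and not taken from the text.

```python
def classify(point, a, b, c, d):
    """Classify a point relative to the open rectangle a < x < b, c < y < d.

    Assumes a < b and c < d. Returns "interior", "boundary", or "exterior".
    """
    x, y = point
    if a < x < b and c < y < d:
        return "interior"   # a small disk around the point stays inside S
    if a <= x <= b and c <= y <= d:
        return "boundary"   # the point lies on a side or at a corner
    return "exterior"       # a small disk around the point misses S

def interior_radius(point, a, b, c, d):
    """For an interior point, return one admissible radius eps.

    Half the smallest distance to the four sides satisfies
    a < x - eps < x + eps < b and c < y - eps < y + eps < d.
    """
    x, y = point
    if classify(point, a, b, c, d) != "interior":
        raise ValueError("point is not an interior point")
    return min(x - a, b - x, y - c, d - y) / 2

print(classify((0.5, 0.5), 0, 1, 0, 1))   # a point inside the rectangle
print(classify((0.0, 0.5), 0, 1, 0, 1))   # a point on a side
print(classify((1.0, 1.0), 0, 1, 0, 1))   # a corner
print(classify((2.0, 0.0), 0, 1, 0, 1))   # a point outside
```

Every point of the plane receives exactly one of the three labels, which mirrors the statement that the three classes are mutually exclusive.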


Figure 1.7 Interior point A, exterior point D, 
boundary points B, C of rectangular region. 


By definition, every interior point P of set S is necessarily a point 
of S, for there is a neighborhood of P consisting entirely of points of 
S, and P belongs to that neighborhood. Similarly, any exterior point 
of S definitely does not belong to S. On the other hand, the boundary 
points of a set sometimes do, and sometimes do not belong to the set.¹
The open rectangle

$a < x < b, \quad c < y < d$

does not contain its boundary points, while the closed rectangle

$a \le x \le b, \quad c \le y \le d$


does. 


¹ Observe the distinction between “not belonging to S” and “exterior to S.” A
boundary point of S never is exterior, even when it does not belong to S.

Generally we call a set S of points open if no boundary point of S
belongs to S (i.e., if S consists entirely of interior points). S is called 
closed if it contains its boundary. From any set S we can always 
obtain a closed set by adding to S all its boundary points, insofar 
as they do not belong to S already. We then obtain a new set, the 
closure $\bar{S}$ of S. The reader can easily verify that the closure of S is a
closed set. The exterior points are exactly those that do not belong to 
the closure of S. Similarly, we define the interior S° of S as the 
set of interior points of S, that is, the set obtained by removing the 
boundary points from S. The interior of S is open. 

It should be observed that sets do not have to be either open or 
closed. We can easily construct a set S containing only part of its 
boundary, such as the semiopen rectangle 


$a \le x < b, \quad c \le y < d.$


It is also important to realize that our notion of boundary applies to 
quite general sets and furnishes results far removed from intuition. 
A prime example of a set that is in no sense a “curve” or a “region” 
is the set S consisting of the “rational points” of the plane, that is, 
of those points P = (x, y) for which both coordinates x and y are 
rational numbers. Clearly, every disk in the plane contains both ra- 
tional and nonrational points. Hence here there is no boundary 
“curve”; the boundary ∂S consists of the whole plane. There exist
neither interior nor exterior points. 

Even in cases where the boundary is one-dimensional, not all of 
it serves to separate interior from exterior points. For example, the 
inequalities

$(x - \alpha)^2 + (y - \beta)^2 < r^2, \quad y \ne \beta$

describe a disk with one diameter cut out; here the boundary consists
of the circle $(x - \alpha)^2 + (y - \beta)^2 = r^2$, and of the diameter

$y = \beta, \quad |x - \alpha| < r.$

Any sufficiently small neighborhood of a point of that diameter
contains no exterior points at all (Fig. 1.8).

Figure 1.8 Disk with diameter removed.


d. Closure as Set of Limit Points 


The notions of “interior,” “boundary,” and “exterior” of a set
S are of importance when we consider limits of sequences of points
$P_1, P_2, \ldots$ all of which belong to the set S.¹ Clearly, a point Q
exterior to S cannot be the limit of the sequence, since there is a
neighborhood of Q free of points of S, which prevents the $P_n$ from
coming arbitrarily close to Q. Hence, the limit of a sequence of points
in S must either be a boundary point or an interior point of S. Since
the interior and boundary points of S form the closure of S it follows
that limits of sequences in S belong to the closure of S.

Conversely, every point Q of the closure of S is actually the limit
of some sequence $P_1, P_2, \ldots$ of points of S, for if Q is a point of the
closure, then Q either belongs to S or to its boundary. In the first
case we have trivially in Q, Q, Q, . . . a sequence of points of S
converging to Q. In the second case, for any ε > 0 the ε-neighborhood
of Q contains at least one point of S. For every natural number n we
may choose a point $P_n$ of S belonging to the ε-neighborhood of Q
with ε = 1/n. Clearly, the $P_n$ converge to Q.
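
The construction in the second case can be made concrete for one particular set. The Python sketch below assumes, purely for illustration, that S is the open unit disk and Q = (1, 0) is a boundary point not belonging to S; the helper names `in_S` and `approximating_point` are hypothetical.

```python
import math

def in_S(p):
    """Membership test for the example set S: the open unit disk."""
    return p[0] ** 2 + p[1] ** 2 < 1

def approximating_point(n, Q=(1.0, 0.0)):
    """Return a point of S inside the (1/n)-neighborhood of Q.

    Q is assumed to lie on the unit circle. Shrinking Q toward the
    origin by the factor 1 - 1/(2n) moves it a distance 1/(2n) < 1/n.
    """
    t = 1 - 1 / (2 * n)
    return (t * Q[0], t * Q[1])

Q = (1.0, 0.0)
for n in (1, 10, 100):
    P = approximating_point(n)
    # Each P_n belongs to S and lies within 1/n of the boundary point Q.
    print(n, P, in_S(P), math.hypot(P[0] - Q[0], P[1] - Q[1]) < 1 / n)
```

For a general set S no such explicit formula is available; the text only asserts that a suitable point exists in each neighborhood.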


e. Points and Sets of Points in Space 


An ordered triple of numbers (x, y, z) can be represented in the
usual manner by a point P in space. Here the numbers x, y, z, the
Cartesian coordinates of P, are the (signed) distances of P from three
mutually perpendicular planes. The distance $\overline{PP'}$ between the two
points P = (x, y, z) and P’ = (x’, y’, z’) is given by

$\overline{PP'} = \sqrt{(x' - x)^2 + (y' - y)^2 + (z' - z)^2}.$

The ε-neighborhood of the point Q = (a, b, c) consists of the points
P = (x, y, z) for which $\overline{PQ} < \varepsilon$; these points form the ball given by
the inequality

$(x - a)^2 + (y - b)^2 + (z - c)^2 < \varepsilon^2.$


¹ The points $P_n$ do not have to be distinct from one another.

The analogues to the rectangular plane regions are the rectangular
parallelepipeds¹ described by a system of inequalities of the form


a<x< b, c<y<d, e<z<f. 


All the notions developed for plane sets—boundary, closure, and 
so on—carry over to sets in three dimensions in an obvious way. 

When we are dealing with ordered quadruples like x, y, z, w, our 
visual intuition fails to provide a geometrical interpretation. Still, 
it is convenient to make use of geometrical terminology, attributing 
to (x, y, z, w) a “point in four-dimensional space.” The quadruples 
(x, y, z, w) satisfying an inequality of the form 


$(x - a)^2 + (y - b)^2 + (z - c)^2 + (w - d)^2 < \varepsilon^2$

constitute, by definition, the ε-neighborhood of the point (a, b, c, d).
A rectangular region² is described by a system of inequalities of the
form

$a < x < b, \quad c < y < d, \quad e < z < f, \quad g < w < h.$


Of course, there is nothing mysterious in this idea of “points” in 
four dimensions; it is just a convenient terminology and implies 
nothing about the physical reality of four-dimensional space. Indeed, 
nothing prevents us from calling an “n-tuple” $(x_1, \ldots, x_n)$ a “point”
in n-dimensional space, where n can be any natural number. For many
applications it is quite useful and suggestive to represent a system
described by n quantities in this way by a single point in some
higher-dimensional space.³ Often analogies with geometric interpretations
in three-dimensional space provide guidance for operating in more 
than three dimensions. 
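
The distance formula and the two kinds of neighborhoods carry over to n-tuples word for word, which the following Python sketch makes explicit. The function names are chosen here for illustration, and the general n-dimensional distance is written by analogy with the formulas given above for two, three, and four dimensions.

```python
import math

def distance(p, q):
    """Euclidean distance between two n-tuples of equal length."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def in_neighborhood(p, center, eps):
    """True if p lies in the eps-neighborhood (open ball) of center."""
    return distance(p, center) < eps

def in_open_cell(p, lower, upper):
    """True if lower[i] < p[i] < upper[i] holds for every coordinate."""
    return all(lo < x < hi for x, lo, hi in zip(p, lower, upper))

origin = (0, 0, 0, 0)
print(distance(origin, (1, 1, 1, 1)))
print(in_neighborhood((0.1, 0, 0, 0), origin, 0.5))
print(in_open_cell((0.5, 0.5, 0.5, 0.5), origin, (1, 1, 1, 1)))
```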


Exercises 1.1 


1. A point (x, y) of the plane may be represented by a complex number 
(Volume I, p. 103) in the form z = x + iy. Investigate the convergence 


¹ Parallel epipedon (Greek for “plane”).

² The terms “cell” and “interval” are also used to describe rectangular regions of
this type in higher dimensions.

³ Thus the system of molecules of a gas in a container can be described by the position
of a single point in a “phase-space” with a very high number of dimensions. Going
even further, it is customary in some parts of analysis to represent an infinite
sequence of numbers $x_1, x_2, \ldots$ by a point $(x_1, x_2, \ldots)$ in a space with infinitely
many dimensions.
for different values of z of the sequences
(a) $z^n$
(b) $z^{1/n}$, where $z^{1/n}$ is defined as the primitive nth root of z, that is, as the
root with minimum positive amplitude.
2. Prove for $P_n = (x_n + \xi_n,\ y_n + \eta_n)$ that $\lim_{n \to \infty} P_n = (x + \xi,\ y + \eta)$,
where the limits $x = \lim_{n \to \infty} x_n$, $\xi = \lim_{n \to \infty} \xi_n$, $y = \lim_{n \to \infty} y_n$, $\eta = \lim_{n \to \infty} \eta_n$ are
presumed to exist.

3. Show that every point of the disk $x^2 + y^2 < 1$ is an interior point. Is
this also true for $x^2 + y^2 \le 1$? Explain.

4. Show that the set S of points (x, y) with $y > x^2$ is open.

5. What is the boundary of a line segment considered as a subset of the 
x, y-plane? 


Problems 1.1 


1. Let P be a boundary point of the set S that does not belong to S. Prove
that there exists a sequence of distinct points $P_1, P_2, \ldots$ in S having P
as limit.

2. Prove that the closure of a set is closed.

3. Let P be any point of a set S, and let Q be any point outside the set.
Prove that the line segment PQ contains a boundary point of S.

4. Let G be the set of points (x, y) for which |x| < 1, |y| < 1/2 and for which
y < 0 if x = 1/2. Does G contain only interior points? Give evidence.


1.2 Functions of Several Independent Variables 


a. Functions and Their Domains 


Equations of the form

$u = x + y, \quad u = x^2 y^2, \quad \text{or} \quad u = \log(1 - x^2 - y^2)$

assign a functional value u to a pair of values (x, y). In the first two
of these examples, a value of u is assigned to every pair of values
(x, y), while in the third the correspondence has a meaning only for
those pairs of values (x, y) for which the inequality $x^2 + y^2 < 1$ is true.

In general, we say that u is a function of the independent variables 
x and y whenever some law f assigns a unique value of u, the depend- 
ent variable, to each pair of values (x, y) belonging to a certain spec- 
ified set, the domain of the function. A function u = f(x, y) thus 
defines a mapping of a set of points in the x, y-plane, the domain of 
f, onto a certain set of points on the u-axis, the range of f. Similarly, 
we say that u is a function of the n variables $x_1, x_2, \ldots, x_n$ if for each
set of values $(x_1, \ldots, x_n)$ belonging to a certain specified set there
is assigned a corresponding unique value of u.¹

Thus, for example, the volume u = xyz of a rectangular
parallelepiped is a function of the length of the three sides x, y, z; the
magnetic declination is a function of the latitude, the longitude, and
the time; the sum $x_1 + x_2 + \cdots + x_n$ is a function of the n terms
$x_1, x_2, \ldots, x_n$.

It is to be noted that the domain of a function f is an indispensable 
part of its description. In cases where u = f(x, y) is given by an 
explicit expression, it is natural to take as domain of f all (x, y) for 
which this expression makes sense. However, functions given by the 
same expression but having smaller domains can be defined by
“restriction.” Thus the formula $u = x^2 + y^2$ can be used to define a
function with domain $x^2 + y^2 < 1/2$.

Just as in the case of functions of one variable, a functional 
correspondence u = f(x, y) associates a unique value of u with the 
system of independent variables x, y. Thus, no functional value is 
assigned by an analytic expression that is multivalued, such as 
arc tan y/x, unless we specify, for example, that the “arc tangent” is to
stand for the principal branch with values lying between −π/2 and
+π/2 (see Volume I, p. 214); in addition we have to exclude the line
x = 0.²
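
The difference between the principal branch of arc tan y/x and the geometrically defined polar angle can be checked numerically. The Python sketch below uses the standard library functions `math.atan` and `math.atan2`; the wrapper names `principal_arctan` and `polar_angle` are hypothetical. One caveat: `math.atan2` also returns a value (namely π) on the negative x-axis, so its range is slightly larger than the open interval between −π and π.

```python
import math

def principal_arctan(x, y):
    """arc tan(y/x) on the principal branch, values between -pi/2 and pi/2.

    Not defined on the line x = 0.
    """
    if x == 0:
        raise ValueError("arc tan(y/x) is not defined on the line x = 0")
    return math.atan(y / x)

def polar_angle(x, y):
    """Polar angle of (x, y) counted from the positive x-axis."""
    if x == 0 and y == 0:
        raise ValueError("the polar angle is not defined at the origin")
    return math.atan2(y, x)

# For x > 0 the two functions agree.
print(principal_arctan(1.0, 1.0), polar_angle(1.0, 1.0))
# For x < 0 they differ: the principal branch cannot see the quadrant.
print(principal_arctan(-1.0, 1.0), polar_angle(-1.0, 1.0))
```

In the second call the principal branch gives −π/4, while the point (−1, 1) has polar angle 3π/4.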


b. The Simplest Types of Functions 


Just as in the case of one independent variable, the simplest func- 
tions of more than one variable are the rational integral functions or 
polynomials. The most general polynomial of the first degree, or 
linear function, has the form 


$u = ax + by + c,$


where a, b, and c are constants. The general polynomial of the second 
degree has the form 


¹ Often we think of functions f as assigning a value to a point P rather than to the
pair (x, y) of coordinates describing P. We write then f(P) for f(x, y). This notation is
particularly useful when the functional relation between points P and values f(P) is
defined geometrically without reference to a specific x, y-coordinate system.
² Taking the principal value, we see that u = arc tan y/x for x > 0 is nothing but the
polar angle of the point (x, y) counted from the positive x-axis. This polar angle can
still be defined geometrically in an obvious way as a univalued function with values
between −π and π if we just exclude the origin and the points on the negative x-axis,
but the polar angle is then no longer given by arc tan y/x in the extended region, if
we understand the arc tangent to mean the principal branch.

$u = ax^2 + bxy + cy^2 + dx + ey + f.$


Its domain is the whole x, y-plane. The general polynomial of any
degree is a sum of a finite number of terms $a_{mn} x^m y^n$ (called
monomials), where m and n are nonnegative integers and the coefficients
$a_{mn}$ are arbitrary.

The degree of the monomial $a_{mn} x^m y^n$ is the sum m + n of the
exponents of x and y, provided the coefficient $a_{mn}$ does not vanish. The
degree of a polynomial is the highest degree of any monomial with
nonvanishing coefficient (after combining terms with the same powers
of x and y). A polynomial consisting of monomials all of which have
the same degree N is called a homogeneous polynomial or a form of
degree N. Thus $x^2 + 2xy$ or $3x^3 + (7/5) x^2 y + 2y^3$ are forms.
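
These definitions translate directly into a small Python sketch. It assumes, for illustration only, that a polynomial in x and y is stored as a dictionary mapping exponent pairs (m, n) to coefficients, with like terms already combined; the names `degree` and `is_form` are hypothetical.

```python
def degree(poly):
    """Highest m + n among monomials with nonvanishing coefficient."""
    degrees = [m + n for (m, n), a in poly.items() if a != 0]
    if not degrees:
        raise ValueError("no monomial with nonvanishing coefficient")
    return max(degrees)

def is_form(poly):
    """True if all monomials with nonvanishing coefficient share one degree."""
    degrees = {m + n for (m, n), a in poly.items() if a != 0}
    return len(degrees) == 1

# x^2 + 2xy
p1 = {(2, 0): 1, (1, 1): 2}
# 3x^3 + (7/5) x^2 y + 2y^3
p2 = {(3, 0): 3, (2, 1): 7 / 5, (0, 3): 2}
# x^2 + y + 1, a second-degree polynomial that is not a form
p3 = {(2, 0): 1, (0, 1): 1, (0, 0): 1}

for p in (p1, p2, p3):
    print(degree(p), is_form(p))
```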

By extracting roots of rational functions we obtain certain
algebraic functions,¹ for example,

$u = \sqrt{\frac{x - y}{x + y}} + \sqrt[3]{\frac{(x + y)^2}{x^3 + xy}}.$

Most of the more complicated functions of several variables that
we shall use here can be described in terms of the well-known
functions of one variable, such as

$u = \sin(x \arccos y)$ or $u = \log_x y$.


c. Geometrical Representation of Functions 


Just as we represent functions of one variable by curves, we may 
represent functions of two variables geometrically by surfaces. To 
this end, we consider a rectangular x,y,u-coordinate system in 
space, and mark off above each point (x, y) of the domain R of the 
function in the x, y-plane the point P with the third coordinate u = 
f(x, y). As the point (x, y) ranges over the region R, the point P 
describes a surface in space. This surface we take as the geometrical 
representation of the function. 

Conversely, in analytical geometry, surfaces in space are rep- 
resented by functions of two variables, so that between such sur- 
faces and functions of two variables there is a reciprocal relation. 
For example, to the function

$u = \sqrt{1 - x^2 - y^2}$

there corresponds the hemisphere lying above the x, y-plane, with
unit radius and center at the origin. To the function $u = x^2 + y^2$
there corresponds a so-called paraboloid of revolution, obtained by
rotating the parabola $u = x^2$ about the u-axis (Fig. 1.9). To the
functions $u = x^2 - y^2$ and u = xy, there correspond hyperbolic
paraboloids (Fig. 1.10). The linear function u = ax + by + c has for its
“graph” a plane in space. If in the function u = f(x, y) one of the
independent variables, say y, does not occur, so that u depends on
x only, say u = g(x), the function is represented in x,y,u-space by a
cylindrical surface generated by the perpendiculars to the u,x-plane
at the points of the curve u = g(x).

¹ For a general definition of the term “algebraic function,” see p. 229.


Figure 1.9 $u = x^2 + y^2$. Figure 1.10 $u = x^2 - y^2$.


This representation by means of rectangular coordinates has, how- 
ever, two disadvantages. First, geometric visualization fails us when- 
ever we have to deal with three or more independent variables. 
Second, even for two independent variables it is often more con- 
venient to confine the discussion to the x, y-plane alone, since in the 
plane we can sketch and can perform geometrical constructions with- 
out difficulty. From this point of view, another geometrical represen- 
tation of a function of two variables, by means of contour lines, is 
sometimes preferable. In the x,y-plane we take all the points for 
which u = f(x, y) has a constant value, say u = k. These points will 
usually lie on a curve or curves, the so-called contour line, or level 
line, for the given constant value k of the function. We can also 
obtain these curves by cutting the surface u = f(x, y) by the 
plane u = k parallel to the x, y-plane and projecting the curves of
intersection perpendicularly onto the x, y-plane. 

The system of these contour lines, marked with the corresponding 
values $k_1, k_2, \ldots$ of the height k, gives us a representation of the
function. In practice, k is assigned values in arithmetic progression,
say k = νh, where ν = 1, 2, . . . The distance between the contour
lines then gives us a measure of the steepness of the surface u = 
f(x, y), for between every two neighboring lines the value of the 
function changes by the same amount. Where the contour lines are 
close together, the function rises or falls steeply; where the lines are 
far apart, the surface is flattish. This is the principle on which contour 
maps such as those of the U.S. Geological Survey are constructed. 
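
The relation between contour spacing and steepness can be checked on a simple case. The Python sketch below takes the function x² + y² as an assumed example: its contour line at height k is the circle of radius √k, so for heights in arithmetic progression with step h the gaps between neighboring circles can be computed directly. The name `contour_radii` is hypothetical.

```python
import math

def contour_radii(h, count):
    """Radii of the contour circles of x^2 + y^2 at heights h, 2h, ..., count*h."""
    return [math.sqrt(nu * h) for nu in range(1, count + 1)]

radii = contour_radii(h=1.0, count=5)
# Gap between each pair of neighboring contour lines.
gaps = [r2 - r1 for r1, r2 in zip(radii, radii[1:])]

print(radii)
print(gaps)
```

The gaps shrink as the radius grows: the contour lines crowd together exactly where the paraboloid becomes steeper.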

In this method the linear function u = ax + by + c is represented 
by a system of parallel straight lines ax + by + c = k. The function 
$u = x^2 + y^2$ is represented by a system of concentric circles (cf. Fig.
1.11). The function $u = x^2 - y^2$, whose surface is “saddle-shaped”
(Fig. 1.10), is represented by the system of hyperbolas shown in Fig.
1.12.

Figure 1.11 Contour lines of $u = x^2 + y^2$. Figure 1.12 Contour lines of $u = x^2 - y^2$.


The method of representing the function u = f(x, y) by contour 
lines has the advantage of being capable of extension to functions of 
three independent variables. Instead of the contour lines we then have 
the level surfaces f(x, y, z) = k, where k is a constant to which we can 
assign any suitable sequence of values. For example, the level
surfaces for the function $u = x^2 + y^2 + z^2$ are spheres concentric about
the origin of the x, y, z-coordinate system.

Exercises 1.2


1. Evaluate the following functions at the points indicated: 


_ (are cot (x + y)\3 —14+v3 ._ 1-V3 
(a) 2 = (ER fan Ge) for x= Q °7 7 2 


(b) w = ecos z(z+y), for x=y= > z = —1 


(c) z = yz cos 24, y = e, y = logn 


(d) z = cosh (x + y), x = log m, y= log 5 
_x+y __ 1 —_ 1 
(e) z= 5» r= 5, ISF 


2. As in Volume I, unless we make an explicit exception, we consider the 
domain of a function defined by a formal expression to be the set of all 
points for which the expression is meaningful. Give the domain and 
range of each of the following functions: 


(a) z=vx +y (i) z = /3 — x? — 2y? 
(b) z = y2x — y? Q) z= v=x — y? 
(c) z = ES (k) z = log (x? — y?) 
~ y8 y2 x2 
@z2=/1-%-% (l) z = arc tan ap 
_ _ x 
(e) z = log (x + 5y) (m) z= arc tany Fy 
(f)z= Vx sin y (n) z = cos arc tan” 
(g) w = Va? — x2 — y2 — z? (o0) z = arc cos log (x + y) 
2—2 o 
(h) 2 = F (p)z= Vy cos x. 


3. What is the number of coefficients of a polynomial of degree n in two 
variables? In three variables? In k variables? 


4. For each of the following functions sketch the contour lines corresponding
to z = -2, -1, 0, 1, 2, 3:


(a) z= xy 

(b) $z = x^2 + y^2 - 1$
(c) $z = x^2 - y^2$

(d) z= 7? 


(e) z=y (1-7): 
5. Draw the contour lines for z = cos (2x + y) corresponding to z = 0,
$\pm 1$, $\pm 1/2$.
6. Sketch the surfaces defined by 


(a) z = 2xy 

(b) $z = x^2 + y^2$
(c)z=x-—Yy. 
(d) $z = x^2$


(e) z = sin (x + y). 
7. Find the level lines of the function 


1+ VF 
VH 


8. Find the surfaces on which the function $u = 2(x^2 + y^2)/z$ is constant.


z = log 


1.3 Continuity 


a. Definition 


As in the theory of functions of a single variable, the concept of continuity
figures prominently when we consider functions of several
variables. The statement that the function u = f(x, y) is continuous
at the point $(\xi, \eta)$ should mean, roughly speaking, that for all points
(x, y) near $(\xi, \eta)$ the value of f(x, y) differs but little from the value
$f(\xi, \eta)$. We express this idea more precisely as follows: If f has the
domain R and $Q = (\xi, \eta)$ is a point of R, then f is continuous at Q if
for every $\varepsilon > 0$ there exists a $\delta > 0$ such that

(1)    $|f(P) - f(Q)| = |f(x, y) - f(\xi, \eta)| < \varepsilon$

for all P = (x, y) in R for which¹

(2)    $\overline{PQ} = \sqrt{(x - \xi)^2 + (y - \eta)^2} < \delta$.


If a function is continuous at every point of a set D of points, we say 
that it is continuous in D. 
The following facts are almost obvious: The sum, difference, and 


¹ Instead of confining (x, y) to a small disk with center $(\xi, \eta)$ we could use a small
square. Thus condition (2) in the definition of continuity can be replaced by

(2')    $|x - \xi| < \delta$ and $|y - \eta| < \delta$.
product of continuous functions are also continuous. The quotient
of continuous functions defines a continuous function at points where 
the denominator does not vanish (for the proof see the next section, 
p. 00). In particular, all polynomials are continuous, and all rational 
functions are continuous at the points where the denominator does 
not vanish. Continuous functions of continuous functions are them- 
selves continuous (cf. p. 22). 

A function of several variables may have discontinuities of a much 
more complicated type than a function of a single variable. For 
example, discontinuities may occur along whole arcs of curves, not 
just at isolated points. This is the case for the function defined by 


$u = y/x$ for $x \neq 0$; $u = 0$ for $x = 0$,


which is discontinuous along the whole line x = 0. Moreover, a 
function f(x, y) may be continuous in x for each fixed value of y and 
continuous in y for each fixed value of x, and yet be discontinuous as 
a function of the point (x, y). This is exemplified by 


$f(x, y) = \frac{2xy}{x^2 + y^2}$ for $(x, y) \neq (0, 0)$, $f(0, 0) = 0$.


For any fixed $y \neq 0$, this function is obviously continuous as a
function of x, as the denominator cannot vanish. For y = 0 we have 
f(x, 0) = 0, which also is continuous as a function of x. Similarly, 
f(x, y) is continuous as a function of y for any fixed x. But at every 
point of the line y = x except at the point x = y = 0 we have f(x, y) = 
1, and there are points of this line arbitrarily close to the origin. 
Hence, f(x, y) is discontinuous at the point (0, 0). 
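
The behavior just described can be checked numerically. The following Python sketch is only an illustration and is not part of the original text; the helper names `f` and `sample_along` are hypothetical. It evaluates the function at points approaching the origin along the two coordinate axes and along the diagonal y = x.

```python
def f(x, y):
    """The example above: 2xy/(x^2 + y^2) away from the origin, 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return 2.0 * x * y / (x * x + y * y)


def sample_along(direction, steps=6):
    """Values of f at the points t*direction for t = 10^-1, ..., 10^-steps."""
    dx, dy = direction
    return [f(dx * 10.0 ** -n, dy * 10.0 ** -n) for n in range(1, steps + 1)]


along_x_axis = sample_along((1.0, 0.0))    # f(x, 0) = 0 for every x
along_y_axis = sample_along((0.0, 1.0))    # f(0, y) = 0 for every y
along_diagonal = sample_along((1.0, 1.0))  # f(t, t) = 1 for every t != 0

print(along_x_axis, along_y_axis, along_diagonal)
```

Along each axis the sampled values agree with f(0, 0) = 0, while along the diagonal they stay at 1 however close the points come to the origin, which is the discontinuity asserted in the text.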

Just as in the case of functions of a single variable, a function
f(P) = f(x, y) is called uniformly continuous in the set R of the x, y-plane
if f is defined at the points of R and if for every $\varepsilon > 0$ there exists
a positive $\delta = \delta(\varepsilon)$ such that $|f(P) - f(Q)| < \varepsilon$ for any two points
P, Q in R of distance $< \delta$.¹ The quantity $\delta = \delta(\varepsilon)$ is called a modulus
of continuity for f. We have the basic theorem:

A function f that is defined and continuous in a closed and bounded
set R is uniformly continuous in R. (For the proof see the Appendix 
to this chapter.) 

Particularly important is the case in which we can find a modulus 
of continuity that is proportional to $\varepsilon$ (see Volume I, p. 43). The


¹ The essential requirement making the continuity uniform is that $\delta$ depends on $\varepsilon$ but
not on P or Q.
function f(P) defined in R is called Lipschitz-continuous if there
exists a constant L such that

(3)    $|f(P) - f(Q)| \leq L \, \overline{PQ}$ for all points P, Q in R.

(L is called the "Lipschitz constant," relation (3) the "Lipschitz
condition.") It is clear that a Lipschitz-continuous function f is
uniformly continuous and has $\delta = \varepsilon/L$ as modulus of continuity.¹


b. The Concept of Limit of a Function of Several Variables 


The notion of limit of a function is closely related to the notion
of continuity. Let us suppose that f(x, y) is a function with domain
R. Let $Q = (\xi, \eta)$ be a point of the closure of R. We say that f has the
limit L for (x, y) tending to $(\xi, \eta)$ and write

(4)    $\lim_{(x, y) \to (\xi, \eta)} f(x, y) = L$ or $\lim_{P \to Q} f(P) = L$,²

if for every $\varepsilon > 0$ we can find a neighborhood

(5)    $\overline{PQ} = \sqrt{(x - \xi)^2 + (y - \eta)^2} < \delta$

of $(\xi, \eta)$ such that

$|f(P) - L| = |f(x, y) - L| < \varepsilon$

for all P = (x, y) belonging to R in that neighborhood.³

In case the point $(\xi, \eta)$ belongs to the domain of f we have in (x, y) =
$(\xi, \eta)$ a point of R satisfying (5) for all $\delta > 0$. Then (4) implies in
particular that

$|f(\xi, \eta) - L| < \varepsilon$


¹ The still wider class of "Hölder-continuous" functions f is obtained when we replace
the Lipschitz condition (3) by the Hölder condition

$|f(P) - f(Q)| \leq L \, \overline{PQ}^{\alpha}$ for all P, Q in R.

L and $\alpha$ are constants and $0 < \alpha < 1$ (see Volume I, p. 44). These functions also
are uniformly continuous, and we can choose as modulus of continuity the quantity
$\delta = (\varepsilon/L)^{1/\alpha}$.

² Or else $\lim f(x, y) = L$ for $(x, y) \to (\xi, \eta)$ or $\lim_{x \to \xi,\, y \to \eta} f(x, y) = L$.

³ The notion makes no sense for points $(\xi, \eta)$ exterior to R since then there exist no
points arbitrarily close to $(\xi, \eta)$ in which f is defined, and every L could be considered
as limit.
for all $\varepsilon > 0$ and hence that $L = f(\xi, \eta)$. But then, by definition, the
relation

$\lim_{(x, y) \to (\xi, \eta)} f(x, y) = f(\xi, \eta)$

is identical with the condition for continuity of f at $(\xi, \eta)$. Hence,
continuity of the function f at the point $(\xi, \eta)$ is equivalent to the statement
that f is defined at $(\xi, \eta)$ and that f(x, y) has the limit $f(\xi, \eta)$ for (x, y)
tending to $(\xi, \eta)$.

If f is not defined at the boundary point $(\xi, \eta)$ of its domain but has
a limit L for $(x, y) \to (\xi, \eta)$, we can naturally extend the definition of
f to the point $(\xi, \eta)$ by putting $f(\xi, \eta) = L$; the function f extended in
this way will then be continuous at $(\xi, \eta)$. If f(x, y) is continuous in
its domain R, we can extend the definition of f as limit not just to a
single boundary point $(\xi, \eta)$ but simultaneously to all boundary points
of R for which f has a limit. The resulting extended function is
again continuous, as the reader may verify as an exercise. Take, for
example, the function


$f(x, y) = e^{-x^2/y}$


defined for all (x, y) with y > 0. This function obviously is continuous
at all points of its domain R, the upper half-plane. Consider a boundary
point $(\xi, 0)$. For $\xi \neq 0$ we have clearly

$\lim_{(x, y) \to (\xi, 0)} f(x, y) = \lim_{s \to \infty} e^{-s} = 0$

when y is restricted to positive values. If then we define the extended
function f*(x, y) by

$f^*(x, y) = f(x, y) = e^{-x^2/y}$

for y > 0 and all x, and by

$f^*(x, 0) = 0$

for $x \neq 0$, the function f* will be continuous in its domain R*, where
R* is the closed upper half-plane $y \geq 0$ with the exception of the
point (0, 0). At the origin f* does not have a limit, and hence it is not
possible to define f*(0, 0) in such a way that the extension is continuous
at the origin. Indeed, for (x, y) on the parabola $y = kx^2$, we
have

$f(x, y) = e^{-1/k}$.

Approaching the origin along different parabolas leads to different
limiting values, so that there exists no single limit of f(x, y) for (x, y)
$\to (0, 0)$.
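
As a numerical illustration (a sketch that is not part of the original text; the names `f` and `limit_along_parabola` are hypothetical), the following Python fragment evaluates $e^{-x^2/y}$ along several parabolas $y = kx^2$ with k > 0 and compares the values with $e^{-1/k}$.

```python
import math


def f(x, y):
    """f(x, y) = exp(-x^2 / y), defined only for y > 0."""
    return math.exp(-x * x / y)


def limit_along_parabola(k, steps=8):
    """Values of f on the parabola y = k*x^2 (k > 0) for x = 10^-1, ..., 10^-steps."""
    values = []
    for n in range(1, steps + 1):
        x = 10.0 ** -n
        values.append(f(x, k * x * x))
    return values


for k in (0.5, 1.0, 2.0):
    values = limit_along_parabola(k)
    # The value nearest the origin is printed next to exp(-1/k) for comparison.
    print(k, values[-1], math.exp(-1.0 / k))
```

On each parabola the function is constant and equal to $e^{-1/k}$, so the values approached at the origin depend on k; this is why no single limit can exist there.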

We can also relate the concept of limit of a function f(x, y) to that 
of limit of a sequence (cf. Volume I, p. 82). Suppose f has the domain 
R and 


$\lim_{(x, y) \to (\xi, \eta)} f(x, y) = L$.

Let $P_n = (x_n, y_n)$ for n = 1, 2, . . . , be any sequence of points in R for
which $\lim_{n \to \infty} P_n = (\xi, \eta)$. Then the sequence of numbers $f(x_n, y_n)$ has the
limit L. For f(x, y) will differ arbitrarily little from L for all (x, y) in R
sufficiently close to $(\xi, \eta)$, and $(x_n, y_n)$ will be sufficiently close to $(\xi, \eta)$
if only n is sufficiently large. Conversely, $\lim f(x, y)$ for $(x, y) \to (\xi, \eta)$
exists and has the value L if for every sequence of points $(x_n, y_n)$ in
R with limit $(\xi, \eta)$ we have $\lim_{n \to \infty} f(x_n, y_n) = L$. The proof can easily be
supplied by the reader. If we restrict ourselves to points $(\xi, \eta)$ in the
domain of f, we obtain the statement that continuity of f in its domain
R means just that

(6)    $\lim_{n \to \infty} f(x_n, y_n) = f(\xi, \eta)$

whenever $\lim_{n \to \infty} (x_n, y_n) = (\xi, \eta)$ or that

$\lim f(x_n, y_n) = f(\lim x_n, \lim y_n)$,

where we only consider sequences $(x_n, y_n)$ in R that converge and have
their limits in R. Essentially, then, continuity of a function f allows
the interchange of the symbol for f with that for limit.

It is clear that the notions of limit of a function and of continuity 
apply just as well when the domain of f is not a two-dimensional region
but a curve or any other point set. For example, the function 


$f(x, y) = (x + y)!$


is defined in the set R consisting of all the lines x + y = const. = n, 
where n is a positive integer. Obviously, f is continuous in its domain 
R. 
It was mentioned earlier (p. 17) that when f(x, y) and g(x, y) are
continuous at a point $(\xi, \eta)$, then f + g, f - g, f · g, and for $g(\xi, \eta) \neq 0$
also f/g are continuous at $(\xi, \eta)$. These rules follow immediately
from the formulation of continuity in terms of convergence of sequences.
For any sequence $(x_n, y_n)$ of points belonging to the domains
of f and g and converging to $(\xi, \eta)$, we have by (6)

$\lim f(x_n, y_n) = f(\xi, \eta)$, $\lim g(x_n, y_n) = g(\xi, \eta)$.

The convergence of $f(x_n, y_n) + g(x_n, y_n)$ and so on follows then from
the rules for operating with sequences (Volume I, p. 72).


c. The Order to Which a Function Vanishes 


If the function f(x, y) is continuous at the point $(\xi, \eta)$, the difference
$f(x, y) - f(\xi, \eta)$ tends to 0 as x tends to $\xi$ and y tends to $\eta$. By introducing
the new variables $h = x - \xi$ and $k = y - \eta$, we can express
this as follows: The function $\phi(h, k) = f(\xi + h, \eta + k) - f(\xi, \eta)$ of
the variables h and k tends to 0 as h and k tend to 0.

We shall frequently meet with functions $\phi(h, k)$ which tend to 0 as
h and k do. As in the case of one independent variable, for many
purposes it is useful to describe the behavior of $\phi(h, k)$ for $h \to 0$ and
$k \to 0$ more precisely by distinguishing between different "orders of
vanishing" or "orders of magnitude" of $\phi(h, k)$. For this purpose we
base our comparisons on the distance

$\rho = \sqrt{h^2 + k^2} = \sqrt{(x - \xi)^2 + (y - \eta)^2}$

of the point with coordinates $x = \xi + h$ and $y = \eta + k$ from the point
with coordinates $\xi$ and $\eta$ and make use of the following definition:

A function $\phi(h, k)$ vanishes as $\rho \to 0$ to at least the same order as
$\rho = \sqrt{h^2 + k^2}$, provided that there is a constant C independent of
h and k such that the inequality

$\left| \frac{\phi(h, k)}{\rho} \right| \leq C$

holds for all sufficiently small values of $\rho$; that is, provided there is a
$\delta > 0$ such that the inequality holds for all values of h and k such that


¹ In order to avoid confusion, we expressly point out that a higher order of vanishing
for $\rho \to 0$ implies smaller values in the neighborhood of $\rho = 0$; for example, $\rho^2$ vanishes
to a higher order than $\rho$ and $\rho^2$ is smaller than $\rho$ when $\rho$ is nearly 0.
$0 < \sqrt{h^2 + k^2} < \delta$. We write, then, symbolically: $\phi(h, k) = O(\rho)$. Further,
we say that $\phi(h, k)$ vanishes to a higher order¹ than $\rho$ if the quotient
$\phi(h, k)/\rho$ tends to 0 as $\rho \to 0$. This will be expressed by the symbolical
notation $\phi(h, k) = o(\rho)$ for $(h, k) \to 0$ (see Volume I, p. 253, where the
symbols "o" and "O" are explained for functions of a single variable).

Let us consider some examples. Since

$\left| \frac{h}{\rho} \right| \leq 1$ and $\left| \frac{k}{\rho} \right| \leq 1$,

the components h and k of the distance $\rho$ in the direction of the x-
and y-axes vanish to at least the same order as the distance itself. The
same is true for a linear homogeneous function ah + bk with constants
a and b or for the function $\rho \sin (1/\rho)$. For fixed values of $\alpha$ greater
than 1, the power $\rho^{\alpha}$ of the distance vanishes to a higher order than
$\rho$; symbolically, $\rho^{\alpha} = o(\rho)$ for $\alpha > 1$. Similarly, a homogeneous
quadratic polynomial $ah^2 + bhk + ck^2$ in the variables h and k
vanishes to a higher order than $\rho$ as $\rho \to 0$:

$ah^2 + bhk + ck^2 = o(\rho)$.


More generally, the following definition is used. If the comparison
function $\omega(h, k)$ is defined for all nonzero values of (h, k) in a sufficiently
small circle about the origin and is not equal to 0, then $\phi(h, k)$
vanishes to at least the same order as $\omega(h, k)$ as $\rho \to 0$ if for some suitably
chosen constant C the relation

$\left| \frac{\phi(h, k)}{\omega(h, k)} \right| \leq C$

holds in a neighborhood of the point (h, k) = (0, 0). We indicate this
by the symbolic equation $\phi(h, k) = O(\omega(h, k))$. Similarly, $\phi(h, k)$
vanishes to a higher order than $\omega(h, k)$, or $\phi(h, k) = o(\omega(h, k))$, if

$\frac{\phi(h, k)}{\omega(h, k)} \to 0$ when $\rho \to 0$.

For example, the homogeneous polynomial $ah^2 + bhk + ck^2$ is at
least of the same order as $\rho^2$, since

$|ah^2 + bhk + ck^2| \leq \left( |a| + \tfrac{1}{2}|b| + |c| \right)(h^2 + k^2)$.

Also $\rho = o(1/|\log \rho|)$, since $\lim_{\rho \to 0} (\rho \log \rho) = 0$ (Volume I, p. 252).
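
The two statements about the quadratic polynomial can be illustrated numerically. The Python sketch below is not from the original text; the coefficients a = 1, b = -3, c = 2 and the names `quadratic` and `ratios` are arbitrary choices made for the illustration. For points at distance $\rho$ from the origin in a fixed direction it reports the quotients by $\rho$ and by $\rho^2$.

```python
import math


def quadratic(h, k, a=1.0, b=-3.0, c=2.0):
    """Homogeneous quadratic polynomial a*h^2 + b*h*k + c*k^2 (coefficients are illustrative)."""
    return a * h * h + b * h * k + c * k * k


def ratios(theta, steps=6):
    """Pairs (|q|/rho, |q|/rho^2) at the points rho*(cos theta, sin theta), rho = 10^-n."""
    out = []
    for n in range(1, steps + 1):
        rho = 10.0 ** -n
        h, k = rho * math.cos(theta), rho * math.sin(theta)
        q = abs(quadratic(h, k))
        out.append((q / rho, q / rho ** 2))
    return out


for theta in (0.0, 0.7, 1.5, 2.9):
    print(theta, ratios(theta)[-1])
```

The first quotient shrinks in proportion to $\rho$, in line with the relation $o(\rho)$, while the second stays below $|a| + \tfrac{1}{2}|b| + |c| = 4.5$, in line with the bound $O(\rho^2)$.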


Exercises 1.3


1. The function z = (x - y)/(x + y) is discontinuous along y = -x. Sketch
the level lines of its surface for z = 0, $\pm 1$, $\pm 2$. What is the appearance
of the level lines for z = $\pm m$, and m large?

2. Examine the continuity of the function $z = (x^2 + y)/\sqrt{x^2 + y^2}$, where
z = 0 for x = y = 0. Sketch the level lines z = k (k = -4, -2, 0, 2, 4).
Exhibit (on one graph) the behavior of z as a function of x alone for y
= -2, -1, 0, 1, 2. Similarly, exhibit the behavior of z as a function of
y alone for x = 0, $\pm 1$, $\pm 2$. Finally, exhibit the behavior of z as a function
of $\rho$ alone when $\theta$ is constant ($\rho$, $\theta$ being polar coordinates).


3. Verify that the functions

(a) $f(x, y) = x^3 - 3xy^2$

(b) $g(x, y) = x^4 - 6x^2 y^2 + y^4$

are continuous at the origin by determining the modulus of continuity
$\delta(\varepsilon)$. To what order does each function vanish at the origin?


4. Show that the following functions are continuous:

(a) $\sin (x^2 + y)$

(b) $\frac{\sin xy}{\sqrt{x^2 + y^2}}$

(c) $\frac{x^3 + y^3}{x^2 + y^2}$

(d) $x^2 \log (x^2 + y^2)$

where in each case the function is defined at (0, 0) to be equal to the
limit of the given expression.


5. Find a modulus of continuity, $\delta = \delta(\varepsilon, x, y)$, for the continuous functions
(a) $f(x, y) = \sqrt{1 + x^2 + 2y^2}$
(b) f(x, y) = v1 + e”. 


6. Where is the function $z = 1/(x^2 - y^2)$ discontinuous?

7. Where is the function $z = \tan \pi y / \cos \pi x$ discontinuous?

8. For what set of values (x, y) is the function $z = \sqrt{y \cos x}$ continuous?

9. Show that the function $z = 1/(1 - x^2 - y^2)$ is continuous in the unit
disk $x^2 + y^2 < 1$.


10. Find the condition that the polynomial

$P = ax^2 + 2bxy + cy^2$

has exactly the same order as $\rho^2$ in the neighborhood of x = 0, y = 0
(i.e., that both $P/\rho^2$ and $\rho^2/P$ are bounded).
Find whether or not the following functions are continuous, and if 
not, where they are discontinuous: 


in Y 
(a) sin x 


oy oe
(c) > + 
@ a 
13. Show that the functions
fe D= Bape: TEET 


tend to 0 if (x, y) approaches the origin along any straight line but that 
f and g are discontinuous at the origin. 

14. Determine whether the following functions have limits at x = y = 0
and give the limit when it exists. 


g(x, y) = 


x2 — y2 
(a) 24 y? (e) exp [— |x — y|/(x? — 2xy + y?) 
x? + 2xy + y? 
O) “ayy (f) |x|” 
2 2 
x" + 3xy + y“ (g) |x] tay! 


(C) ay 4xy + y? 


___|x—y| o yl?! vx + 

(d) x? — 2xy + y? (b) a nl 
x2 +y? + | y/x| 

15. Find a modulus of continuity $\delta(\varepsilon)$ for those functions of Exercise 14
that have limits at x = y = 0, where the functions are defined at the
origin by their limiting values.
16. Show that $f(x, y, z) = (x^2 + y^2 - z^2)/(x^2 + y^2 + z^2)$ is not continuous at
(0, 0, 0).
17. Prove that if P(x, y) and Q(x, y) are each polynomials of degree n > 0,
vanishing at the origin,

$R(x, y) = \frac{P(x, y)}{Q(x, y)}$

is not continuous at the origin.
18. Find the limits of the following expressions as (x, y) tends to (0, 0) in an
arbitrary manner:

(a) $\frac{\sin (x^2 + y^2)}{x^2 + y^2}$


sin (x4 + y’) 
O r 
(c) $\frac{e^{-1/(x^2 + y^2)}}{x^4 + y^4}$
19. Show that the function z = 3(x - y)/(x + y) can tend to any limit
as (x, y) tends to (0, 0). Give examples of variations of (x, y) such that 


(a) lim z=2 
(b) lim z= —1 


(c) lim z does not exist 
20. If $f(x, y) \to 0$ as $(x, y) \to (0, 0)$ along all straight lines passing through the
origin, does $f(x, y) \to 0$ as $(x, y) \to (0, 0)$ along any path?
21. Examine the behavior of z = y log x in a neighborhood of the origin
(0, 0).
22. For $z = f(x, y) = (x^2 - y)/2x$, draw the graphs of

(a) $z = f(x, x^2)$


(b) z = f(x, 0) 
(c) z = f(x, 1) 
(d) z = f(x, x) 


Does the limit of f(x, y) as $(x, y) \to (0, 0)$ exist?
23. Give a geometrical interpretation of the following statement: $\phi(h, k)$
vanishes to the same order as $\rho = \sqrt{h^2 + k^2}$.


Problems 1.3 


1. Let the continuous function f be extended to the function f* defined so 
that f* = f on the domain of f and f*(Q) = lim f(P) for all points Q on 
the boundary of f where the limit exists. Prove that f* is continuous. 


2. Prove that $\lim f(x, y)$ for $(x, y) \to (\xi, \eta)$ exists and has the value L if
and only if for every sequence of points $(x_n, y_n)$ in the domain of f with
limit $(\xi, \eta)$ we have $\lim_{n \to \infty} f(x_n, y_n) = L$.


1.4 The Partial Derivatives of a Function 


a. Definition. Geometrical Representation 


If in a function of several variables we assign definite numerical
values to all but one of the variables and allow only that variable,
say x, to vary, the function becomes a function of a single variable. We
consider a function u = f(x, y) of the two variables x and y and
assign to y a definite fixed value $y = y_0 = c$. The resulting function
$u = f(x, y_0)$ of the single variable x may be represented geometrically
by cutting the surface u = f(x, y) by the plane $y = y_0$ (cf. Figs. 1.13
and 1.14). The curve of intersection thus formed in the plane is represented
by the equation $u = f(x, y_0)$. If we differentiate this function
in the usual way at the point $x = x_0$, assuming that f is defined in a
neighborhood of $(x_0, y_0)$ and that the derivative exists,¹ we obtain the
partial derivative of f(x, y) with respect to x at the point $(x_0, y_0)$:

$\lim_{h \to 0} \frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}$.

Figure 1.13 and Figure 1.14 Sections of u = f(x, y).


Geometrically, this partial derivative denotes the tangent of the 
angle between a parallel to the x-axis and the tangent line to the 
curve $u = f(x, y_0)$. It is therefore the slope of the surface u = f(x, y) in
the direction of the x-axis. 

To represent these partial derivatives several different notations 
are used, one of which is the following: 


$\lim_{h \to 0} \frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h} = f_x(x_0, y_0) = u_x(x_0, y_0)$.


If we wish to emphasize that the partial derivative is the limit of a
difference quotient, we denote it by

$\frac{\partial f}{\partial x}$ or $\frac{\partial}{\partial x} f$.

Here we use the special round letter $\partial$ instead of the ordinary d used
in the differentiation of functions of one variable in order to show
that we are dealing with a function of several variables and differentiating
with respect to one of them.

¹ We shall not try to define a derivative at boundary points of the domain (except,
on occasion, as limit of the values of partial derivatives as the boundary point is
approximated by interior points).
For some purposes it is convenient to use Cauchy's symbol D (mentioned
on p. 158 of Volume I) and to write

$\frac{\partial f}{\partial x} = D_x f$,

but we shall seldom use this symbol.
In exactly the same way we define the partial derivative of f(x, y)
with respect to y at the point $(x_0, y_0)$ by the relation

$\lim_{k \to 0} \frac{f(x_0, y_0 + k) - f(x_0, y_0)}{k} = f_y(x_0, y_0) = D_y f(x_0, y_0)$.


This represents the slope of the curve of intersection of the surface 
u = f(x, y) with the plane $x = x_0$ perpendicular to the x-axis (Fig.
1.14). 

Let us now think of the point $(x_0, y_0)$, hitherto considered fixed, as
variable and accordingly omit the subscripts 0. In other words, we 
think of the differentiation as carried out at any point (x, y) of the 
region of definition of f(x, y). Then the two derivatives are themselves 
functions of x and y, 


$u_x(x, y) = f_x(x, y) = \frac{\partial f(x, y)}{\partial x}$ and $u_y(x, y) = f_y(x, y) = \frac{\partial f(x, y)}{\partial y}$.


For example, the function $u = x^2 + y^2$ has the partial derivatives
$u_x = 2x$ (in differentiation with respect to x the term $y^2$ is regarded
as a constant and so has the derivative 0) and $u_y = 2y$. The partial
derivatives of $u = x^3 y$ are $u_x = 3x^2 y$ and $u_y = x^3$.
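
The difference quotients in the definition can be evaluated directly. The Python sketch below is an illustration added here, not part of the original text; the names `partial_x` and `partial_y`, the step size, and the sample point (1.5, -2) are assumptions. It uses a symmetric difference quotient in place of the one-sided quotient of the definition, which has the same limit whenever the partial derivative exists.

```python
def partial_x(f, x, y, h=1e-6):
    """Symmetric difference quotient in x with y held fixed."""
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)


def partial_y(f, x, y, k=1e-6):
    """Symmetric difference quotient in y with x held fixed."""
    return (f(x, y + k) - f(x, y - k)) / (2.0 * k)


def u(x, y):
    """The example u = x^3 * y from the text."""
    return x ** 3 * y


x0, y0 = 1.5, -2.0
approx_ux = partial_x(u, x0, y0)   # compare with 3 * x0^2 * y0
approx_uy = partial_y(u, x0, y0)   # compare with x0^3
print(approx_ux, 3.0 * x0 ** 2 * y0)
print(approx_uy, x0 ** 3)
```

The quotients agree with the formulas $u_x = 3x^2 y$ and $u_y = x^3$ up to the rounding and truncation error of the finite step.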

Similarly, for a function of any number n of independent variables, 
we define partial derivatives by 


$\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_1} = \lim_{h \to 0} \frac{f(x_1 + h, x_2, \ldots, x_n) - f(x_1, x_2, \ldots, x_n)}{h}$
$= f_{x_1}(x_1, x_2, \ldots, x_n) = D_{x_1} f(x_1, x_2, \ldots, x_n)$,


it being assumed that the limit exists. 

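Of course, we can also form higher partial derivatives of f(x, y) by
again differentiating the partial derivatives of the "first order,"
$f_x(x, y)$ and $f_y(x, y)$, with respect to one of the variables and repeating
this process. We indicate the order in which the differentiations are
carried out by the order of the subscripts or by the order of the
symbols $\partial x$ and $\partial y$ in the "denominator" from right to left¹ and use
the following symbols for the second derivatives:

$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = f_{xx} = D_x^2 f$,
$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x \, \partial y} = f_{xy} = D_x D_y f$,
$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y \, \partial x} = f_{yx} = D_y D_x f$,
$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = f_{yy} = D_y^2 f$.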


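We likewise denote the third partial derivatives by

$\frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x^2}\right) = \frac{\partial^3 f}{\partial x^3} = f_{xxx}$,
$\frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial x^2}\right) = \frac{\partial^3 f}{\partial y \, \partial x^2} = f_{yxx}$,
$\frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x \, \partial y}\right) = \frac{\partial^3 f}{\partial x^2 \, \partial y} = f_{xxy}$,

and so on, and in general the nth derivatives by

$\frac{\partial}{\partial x}\left(\frac{\partial^{n-1} f}{\partial x^{n-1}}\right) = \frac{\partial^n f}{\partial x^n} = f_{x^n}$,
$\frac{\partial}{\partial y}\left(\frac{\partial^{n-1} f}{\partial x^{n-1}}\right) = \frac{\partial^n f}{\partial y \, \partial x^{n-1}} = f_{y x^{n-1}}$,

and so on.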

The different notations for partial derivatives have their respective
advantages. Writing $\partial f(x, y)/\partial x$ or $D_x f(x, y)$ for the partial derivative
of the function f(x, y) with respect to its first argument emphasizes
that differentiation has the character of an operator $D_x$ or $\partial/\partial x$ acting
on the function, written symbolically as a factor multiplying the
function. The notation for higher derivatives is consistent with this
idea of a product:

$\frac{\partial}{\partial y}\left(\frac{\partial}{\partial x} f\right) = \frac{\partial^2}{\partial y \, \partial x} f = D_y D_x f$.

¹ This is consistent with the general notation for symbolic products of operators (see
Volume I, p. 53). Actually, the order in which differentiations are carried out turns
out to be immaterial in most cases of interest (see p. 36).
A disadvantage of the operator notation is its clumsiness when it
comes to indicating for what values of the independent variables the
derivatives are taken. For example, if $f(x, y) = x^2 + 2xy + 4y^2$, then
its x-derivative at the point x = 1, y = 2 can be written as

$\left(\frac{\partial f}{\partial x}\right)_{x=1,\, y=2} = f_x(1, 2) = (2x + 2y)\big|_{x=1,\, y=2} = 6$.

We should not write it simply as

$\frac{\partial f(1, 2)}{\partial x}$

since f(1, 2) has the constant value 21 and hence has 0 as its x-derivative.

Just as in the case of one independent variable, the possession of
derivatives is a special property of a function, not enjoyed even by all
continuous functions.¹ All the same, this property is possessed by all
functions of practical importance, except perhaps at isolated exceptional
points or curves.


Exercises 1.4a 


1. Find $\partial z/\partial x$, $\partial z/\partial y$ for each of the following:


(a) z = ax" + by”, a, b, m, n constants (h) z = 37/y 


(b) z = 2xev” + By (i) z = log (x + 2) 
()z= 25435 (j) z = cos (x2 + y) 
(d) z = arc tan 2 (k) z = tan (xy? + e7) 
(e) z = x?y?!2 Q z= sin > 

(€) z= y? (m) z = xe” + yet 

(B) z = xi? ysis (n) z = xy x F y? 


2. Find the first partial derivatives of the following: 


TETT ee ee 
(a) Veh +y D Fary te 
(b) sin (x2 — y) (e) y sin xz 


¹ For an explanation of the term "differentiable", which implies more than that the
partial derivatives with respect to x and y exist, see pp. 41-42.
(c) et (f) $\log \sqrt{1 + x^2 + y^2}$


3. Find all the first and second partial derivatives of the following:


(a) xy 

(b) log xy 

(c) tan (arc tan x + arc tan y) 
(d) x” 


(e) e” 


4. Let $w = f(x, y, z) = (\cos x / \sin y) e^z$. Find $f_x$, $f_y$, $f_z$ for $x = \pi$, $y = \pi/2$,
z = log 3.

5. For f(x, y) = y cosh x + x sinh y, find $f_x^2 + f_y^2$ at x = 0, y = 0.
6. Show that the functions $u = e^x \cos y$, $v = e^x \sin y$ satisfy the conditions
$u_x = v_y$, $u_y = -v_x$.

7. Show that the functions of Exercise 6 satisfy the partial differential
equation
$f_{xx} + f_{yy} = 0$.
Do the same for the functions 
(a) $\log \sqrt{x^2 + y^2}$
(b) arc tan $(y/x)$


y 
© aia 


(d) $3x^2 y - y^3$
(e) $\sqrt{x + \sqrt{x^2 + y^2}}$


8. For $r = \sqrt{x^2 + y^2 + z^2}$, find $r_{xx} + r_{yy} + r_{zz}$.
9. Find a constant a for which if $z = y^3 + ayx^2$, then $z_{xx} + z_{yy} = 0$.
10. Prove that the function

$f(x_1, x_2, \ldots, x_n) = \frac{1}{(x_1^2 + x_2^2 + \cdots + x_n^2)^{(n-2)/2}}$

satisfies the equation

$f_{x_1 x_1} + f_{x_2 x_2} + \cdots + f_{x_n x_n} = 0$.


Problems 1.4a 


1. How many nth derivatives has a function of three variables? of k variables?

2. Give an example of a function f(x, y) for which $f_x$ exists and $f_y$ does not.
3. Find a function f(x, y) that is a function of $(x^2 + y^2)$ and is also a product


of the form (x) }(y); that is, solve the equation 
f(x, y) = G(x? + y?) = U(x) (y) 


for the unknown functions. 
4. Prove that any function of the form

$u(x, y, z, t) = \frac{f(t + r)}{r} + \frac{g(t - r)}{r}$

(where $r^2 = x^2 + y^2 + z^2$), satisfies the equation

$u_{xx} + u_{yy} + u_{zz} = u_{tt}$.


b. Examples 


In practice, partial differentiation involves nothing that the 
student has not already met. For, according to the definition, all the 
independent variables are to be kept constant except the one with 
respect to which we are differentiating. Therefore, we have merely to 
regard the other variables as constants and carry out the differenti- 
ation according to the rules by which we differentiate functions of a 
single independent variable. We list some partial derivatives of 
several simple functions. 


1. Function:
f(x, y) = xy
First derivatives:
$f_x = y$, $f_y = x$
Second derivatives:
$f_{xx} = 0$, $f_{xy} = f_{yx} = 1$, $f_{yy} = 0$
2. Function:
$f(x, y) = \sqrt{x^2 + y^2}$
First derivatives:

$f_x = \frac{x}{\sqrt{x^2 + y^2}}$, $f_y = \frac{y}{\sqrt{x^2 + y^2}}$

[Thus, for the radius vector $r = \sqrt{x^2 + y^2}$ from the origin to the point
(x, y), the partial derivatives with respect to x and to y are given by $\cos \phi = x/r$
and $\sin \phi = y/r$, where $\phi$ is the angle that the radius vector
makes with the positive direction of the x-axis.]

Second derivatives:

$f_{xx} = \frac{y^2}{\sqrt{(x^2 + y^2)^3}} = \frac{\sin^2 \phi}{r}$,
$f_{xy} = f_{yx} = -\frac{xy}{\sqrt{(x^2 + y^2)^3}} = -\frac{\sin \phi \cos \phi}{r}$,
$f_{yy} = \frac{x^2}{\sqrt{(x^2 + y^2)^3}} = \frac{\cos^2 \phi}{r}$.


3. Reciprocal of the radius vector in three dimensions:

$f(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}} = \frac{1}{r}$

First derivatives:

$f_x = -\frac{x}{\sqrt{(x^2 + y^2 + z^2)^3}} = -\frac{x}{r^3}$,
$f_y = -\frac{y}{\sqrt{(x^2 + y^2 + z^2)^3}} = -\frac{y}{r^3}$,
$f_z = -\frac{z}{\sqrt{(x^2 + y^2 + z^2)^3}} = -\frac{z}{r^3}$.

Second derivatives:

$f_{xx} = -\frac{1}{r^3} + \frac{3x^2}{r^5}$, $f_{yy} = -\frac{1}{r^3} + \frac{3y^2}{r^5}$, $f_{zz} = -\frac{1}{r^3} + \frac{3z^2}{r^5}$,
$f_{xy} = f_{yx} = \frac{3xy}{r^5}$, $f_{yz} = f_{zy} = \frac{3yz}{r^5}$, $f_{zx} = f_{xz} = \frac{3zx}{r^5}$.

From this we see that for the function $f = 1/\sqrt{x^2 + y^2 + z^2} = 1/r$ the
equation

$f_{xx} + f_{yy} + f_{zz} = -\frac{3}{r^3} + \frac{3(x^2 + y^2 + z^2)}{r^5} = 0$

holds for all values of x, y, z except 0, 0, 0; we say, the function
f(x, y, z) = 1/r satisfies the partial differential equation ("Laplace
equation")

$f_{xx} + f_{yy} + f_{zz} = 0$.
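
As a numerical cross-check of the Laplace equation for 1/r, the following Python sketch (an illustration added here, not part of the original text) approximates the three pure second derivatives by second difference quotients. The helper names, the step h, and the sample point (1, 2, -0.5) are assumptions chosen for the example.

```python
import math


def f(x, y, z):
    """Reciprocal of the distance from the origin, f = 1/r."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)


def second_difference(g, t, h):
    """Second difference quotient (g(t+h) - 2 g(t) + g(t-h)) / h^2 of a one-variable function g."""
    return (g(t + h) - 2.0 * g(t) + g(t - h)) / (h * h)


def pure_second_derivatives(x, y, z, h=1e-3):
    """Approximations to f_xx, f_yy, f_zz at (x, y, z), varying one variable at a time."""
    fxx = second_difference(lambda t: f(t, y, z), x, h)
    fyy = second_difference(lambda t: f(x, t, z), y, h)
    fzz = second_difference(lambda t: f(x, y, t), z, h)
    return fxx, fyy, fzz


x0, y0, z0 = 1.0, 2.0, -0.5
fxx, fyy, fzz = pure_second_derivatives(x0, y0, z0)
r = math.sqrt(x0 * x0 + y0 * y0 + z0 * z0)
exact_fxx = -1.0 / r ** 3 + 3.0 * x0 * x0 / r ** 5   # formula for f_xx given in the text
print(fxx, exact_fxx, fxx + fyy + fzz)
```

The approximate value of $f_{xx}$ agrees with the closed formula, and the sum of the three second derivatives is zero up to discretization error.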


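4. Function:

$f(x, y) = \frac{1}{\sqrt{y}} e^{-(x-a)^2/4y}$

First derivatives:

$f_x = -\frac{x - a}{2y^{3/2}} e^{-(x-a)^2/4y}$,
$f_y = \left( -\frac{1}{2y^{3/2}} + \frac{(x - a)^2}{4y^{5/2}} \right) e^{-(x-a)^2/4y}$.

Second derivatives:

$f_{xx} = \left( -\frac{1}{2y^{3/2}} + \frac{(x - a)^2}{4y^{5/2}} \right) e^{-(x-a)^2/4y}$,
$f_{xy} = f_{yx} = \left( \frac{3(x - a)}{4y^{5/2}} - \frac{(x - a)^3}{8y^{7/2}} \right) e^{-(x-a)^2/4y}$,
$f_{yy} = \left( \frac{3}{4y^{5/2}} - \frac{3(x - a)^2}{4y^{7/2}} + \frac{(x - a)^4}{16y^{9/2}} \right) e^{-(x-a)^2/4y}$.

The partial differential equation $f_{xx} - f_y = 0$ is therefore satisfied
identically in x and y.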
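
The identity $f_{xx} - f_y = 0$ can also be checked numerically. The Python sketch below is an illustration that is not part of the original text; the choice a = 0, the sample point (0.7, 1.3), the step sizes, and the helper names are assumptions.

```python
import math


def f(x, y, a=0.0):
    """f(x, y) = exp(-(x - a)^2 / (4y)) / sqrt(y), defined for y > 0."""
    return math.exp(-(x - a) ** 2 / (4.0 * y)) / math.sqrt(y)


def f_xx_numeric(x, y, h=1e-3):
    """Second difference quotient in x with y held fixed."""
    return (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / (h * h)


def f_y_numeric(x, y, k=1e-6):
    """Symmetric difference quotient in y with x held fixed."""
    return (f(x, y + k) - f(x, y - k)) / (2.0 * k)


def f_xx_formula(x, y, a=0.0):
    """Closed form of f_xx (and of f_y) from the list of derivatives above."""
    e = math.exp(-(x - a) ** 2 / (4.0 * y))
    return (-1.0 / (2.0 * y ** 1.5) + (x - a) ** 2 / (4.0 * y ** 2.5)) * e


x0, y0 = 0.7, 1.3
residual = f_xx_numeric(x0, y0) - f_y_numeric(x0, y0)
print(f_xx_numeric(x0, y0), f_xx_formula(x0, y0), residual)
```

Both difference quotients agree with the common closed form of $f_{xx}$ and $f_y$, so their difference vanishes up to discretization error.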


c. Continuity and the Existence of Partial Derivatives 


For a function of a single variable, the existence of the derivative 
at a point implies the continuity of the function at that point (cf. 
Volume I, p. 166). In contrast to this, the possession of partial deriv- 
atives does not imply the continuity of a function of two variables: 
for example, the function $u(x, y) = 2xy/(x^2 + y^2)$, with u(0, 0) = 0, has
partial derivatives everywhere, and yet we have already seen (p. 18) 
that it is discontinuous at the origin. Geometrically speaking, the 
existence of partial derivatives restricts the behavior of the function in 
the directions of the x- and y-axes only and not in other directions. 
Nevertheless, the possession of bounded partial derivatives does imply 
continuity, as is stated by the following theorem: 
If a function f(x, y) has partial derivatives $f_x$ and $f_y$ everywhere in an
open set R, and these derivatives everywhere satisfy the inequalities

$|f_x(x, y)| < M$, $|f_y(x, y)| < M$,

where M is independent of x and y, then f(x, y) is continuous everywhere
in R.¹

For the proof, we consider two points with coordinates (x, y) and 
(x + h, y + k), respectively, both lying in the region R. We further 
assume that the two line segments joining these points to the point 
(x + h, y) both lie entirely in R; this is certainly true if (x, y) is a 
point interior to R and the point (x + h, y + k) lies sufficiently close 
to (x, y). We then have 


(7)    $f(x + h, y + k) - f(x, y) = \{f(x + h, y + k) - f(x + h, y)\} + \{f(x + h, y) - f(x, y)\}$.


The two terms in the first bracket on the right differ only in y; those 
in the second bracket, only in x. We can therefore apply the ordinary 
mean value theorem of the differential calculus (Volume I, p. 174) to 
the first bracket as a function of y alone and to the second bracket as 
a function of x alone. We thus obtain the relation 


(8)    $f(x + h, y + k) - f(x, y) = k f_y(x + h, y + \theta_1 k) + h f_x(x + \theta_2 h, y)$,

where $\theta_1$ and $\theta_2$ are numbers between 0 and 1. In other words, the
derivative with respect to y is to be formed for a point of the vertical
line joining (x + h, y) to (x + h, y + k), and the derivative with respect
to x is to be formed for a point of the horizontal line joining
(x, y) and (x + h, y). Since by hypothesis both derivatives are less
than M in absolute value, it follows that

(9)    $|f(x + h, y + k) - f(x, y)| \leq M(|h| + |k|)$.


For sufficiently small values of h and k the right-hand side is itself 
arbitrarily small, and the continuity of f(x, y) is proved.²


¹ This applies even, as the proof shows, to boundary points of the domain, provided
they can be joined to any neighboring points of the domain by a broken line consisting
of two segments parallel to the axes and f is defined properly at the boundary
point.

² If the domain of f is a rectangle with sides parallel to the axes, the inequality holds
for any two points (x, y) and (x + h, y + k) in the domain. It follows then that f is
even Lipschitz-continuous (see p. 19).
Exercises 1.4c


1. State and prove for a function of three variables f(x, y, z) that the 
existence and boundedness of the first partial derivatives are sufficient 
for the continuity of f. 

2. Show that the following functions f(x, y) are continuous: 


(a) $f(x, y) = e^{-1/(x^2 + y^2)}$ for $(x, y) \ne (0, 0)$, and $f(x, y) = 0$ for x = 0, y = 0
4 4 2 2 
Ofans [FID 108 +9 9 #0 


d. Change of the Order of Differentiation 


In all examples of partial differentiation given on pp. 32-34 we find
that $f_{yx} = f_{xy}$; in other words, it makes no difference whether we
differentiate first with respect to x and then with respect to y or first
with respect to y and then with respect to x. This is true generally
under the conditions of the following theorem:


If the “mixed” partial derivatives $f_{xy}$ and $f_{yx}$ of a function f(x, y) are
continuous in an open set R, then the equation


(10)   $f_{yx} = f_{xy}$


holds throughout R; that is, the order of differentiation with respect to
x and to y is immaterial.

The proof, like that of the previous subsection, is based on the
mean value theorem of the differential calculus. We consider the
four points (x, y), (x + h, y), (x, y + k), and (x + h, y + k), where
$h \ne 0$ and $k \ne 0$. If (x, y) is a point of the open set R and if h and k are
small enough, all four of these points belong to R. We now form the
expression


(11)   $A = f(x + h, y + k) - f(x + h, y) - f(x, y + k) + f(x, y).$

By introducing the function


$\varphi(x) = f(x, y + k) - f(x, y)$


of the variable x and regarding the variable y merely as a “parameter,”
A assumes the form


$A = \varphi(x + h) - \varphi(x).$


Applying the mean value theorem of differential calculus yields


$A = h \varphi'(x + \theta h),$


where $\theta$ lies between 0 and 1. From the definition of $\varphi(x)$, however,
we have


$\varphi'(x) = f_x(x, y + k) - f_x(x, y),$


and since we have assumed that the “mixed” second partial derivative
$f_{yx}$ does exist, we can again apply the mean value theorem and find that


$A = hk f_{yx}(x + \theta h, y + \theta' k),$


where $\theta$ and $\theta'$ denote two unspecified numbers between 0 and 1.
In exactly the same way we may introduce the function


$\psi(y) = f(x + h, y) - f(x, y)$


and express A as

$A = \psi(y + k) - \psi(y).$

We thus arrive at the equation

$A = hk f_{xy}(x + \theta_1 h, y + \theta_1' k),$


where $0 < \theta_1 < 1$ and $0 < \theta_1' < 1$, and if we equate the two
expressions for A, we obtain the equation


$f_{yx}(x + \theta h, y + \theta' k) = f_{xy}(x + \theta_1 h, y + \theta_1' k).$


If here we let h and k tend simultaneously to 0 and recall that the
derivatives $f_{xy}(x, y)$ and $f_{yx}(x, y)$ are continuous at the point (x, y),
we immediately obtain


$f_{yx}(x, y) = f_{xy}(x, y),$

which was to be proved.¹
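
The role of the expression A in (11) can be made concrete: divided by hk, it is a difference quotient that approaches the common value of the two mixed derivatives. The sketch below is only an illustration; the sample function sin x · e^{2y}, the evaluation point, and the step sizes are assumptions, not taken from the text.

```python
import math

def f(x, y):
    # Sample smooth function (an assumption); its mixed derivative is 2 cos(x) e^{2y}.
    return math.sin(x) * math.exp(2.0 * y)

def mixed_quotient(x, y, h, k):
    """Return A/(hk), with A the four-point expression of equation (11)."""
    A = f(x + h, y + k) - f(x + h, y) - f(x, y + k) + f(x, y)
    return A / (h * k)

x, y = 0.5, 0.2
exact = 2.0 * math.cos(x) * math.exp(2.0 * y)

steps = (1e-1, 1e-2, 1e-3)
approx = [mixed_quotient(x, y, s, s) for s in steps]
errors = [abs(a - exact) for a in approx]

for s, a, e in zip(steps, approx, errors):
    print(f"h = k = {s:g}:  A/(hk) = {a:.6f}   error = {e:.2e}")
```

The error shrinks roughly in proportion to the step, in line with the limit h, k → 0 taken at the end of the proof.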


¹For more refined investigations it is often useful to know that the theorem on the
reversibility of the order of differentiation can be proved with weaker hypotheses.
It is, in fact, sufficient to assume that in addition to the first partial derivatives $f_x$ and
$f_y$, only one mixed partial derivative, say $f_{yx}$, exists and that this derivative is
continuous at the point in question. To prove this, we return to equation (11), divide
by hk, and then let k alone tend to 0. Then the right-hand side has a limit, and therefore
the left-hand side also has a limit, and


$\lim_{k \to 0} \frac{A}{hk} = \frac{f_y(x + h, y) - f_y(x, y)}{h}.$

Further, it was proved above with the sole assumption that $f_{yx}$ exists that


$\frac{A}{hk} = f_{yx}(x + \theta h, y + \theta' k).$


By virtue of the assumed continuity of $f_{yx}$, we find that for arbitrary $\varepsilon > 0$ and for

The theorem on the reversibility of the order of differentiation
(i.e., on the commutativity of the differentiation operators $D_x$ and $D_y$)
has far-reaching consequences. In particular, we see that the number
of distinct derivatives of the second order and of higher orders of
functions of several variables is decidedly smaller than we might at
first have expected. If we assume that all the derivatives that we are
about to form are continuous functions of the independent variables
in the region under consideration and if we apply our theorem to the
functions $f_x(x, y)$, $f_y(x, y)$, $f_{xy}(x, y)$, and so on, instead of to the function
f(x, y), we arrive at the equations


$f_{xxy} = f_{xyx} = f_{yxx},$
$f_{xyy} = f_{yxy} = f_{yyx},$


$f_{xxyy} = f_{xyxy} = f_{xyyx} = f_{yxxy} = f_{yxyx} = f_{yyxx},$


and in general we have the following result:

In the repeated differentiation of a function of two independent variables
the order of the differentiations may be changed at will, provided
only that the derivatives in question are continuous functions.¹


all sufficiently small values of h and k
$f_{yx}(x, y) - \varepsilon < f_{yx}(x + \theta h, y + \theta' k) < f_{yx}(x, y) + \varepsilon,$


whence it follows that


$f_{yx}(x, y) - \varepsilon \le \frac{f_y(x + h, y) - f_y(x, y)}{h} \le f_{yx}(x, y) + \varepsilon$


or


$\lim_{h \to 0} \frac{f_y(x + h, y) - f_y(x, y)}{h} = f_{yx}(x, y),$

that is,

$f_{xy}(x, y) = f_{yx}(x, y).$


¹It is of fundamental interest to show by means of an example that without the
assumption of the continuity of the second derivative $f_{xy}$ or $f_{yx}$ the theorem need
not be true and $f_{xy}$ can differ from $f_{yx}$. This is exemplified by the function
$f(x, y) = xy \frac{x^2 - y^2}{x^2 + y^2}$, $f(0, 0) = 0$,
for which all the partial derivatives of second order exist but are not continuous.
We find that


$f_x(0, y) = \lim_{x \to 0} \frac{f(x, y) - f(0, y)}{x} = \lim_{x \to 0} y \frac{x^2 - y^2}{x^2 + y^2} = -y,$

$f_y(x, 0) = \lim_{y \to 0} \frac{f(x, y) - f(x, 0)}{y} = \lim_{y \to 0} x \frac{x^2 - y^2}{x^2 + y^2} = x,$

With our assumptions about continuity, a function of two variables
has three partial derivatives of the second order,


$f_{xx}, \; f_{xy}, \; f_{yy};$

four partial derivatives of the third order,


$f_{xxx}, \; f_{xxy}, \; f_{xyy}, \; f_{yyy};$


and in general (n + 1) partial derivatives of the nth order,


$f_{x^n}, \; f_{x^{n-1}y}, \; f_{x^{n-2}y^2}, \; \ldots, \; f_{xy^{n-1}}, \; f_{y^n}.$


It is obvious that similar statements also hold for functions of more 
than two independent variables. For we can apply our proof equally 
well to the interchange of differentiations with respect to x and z or 
with respect to y and z, and so on, for each interchange of two succes- 
sive differentiations involves only two independent variables at a 
time. 
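
The continuity hypothesis in the theorem cannot simply be dropped. A standard counterexample is the function f(x, y) = xy(x² − y²)/(x² + y²) with f(0, 0) = 0, and its behavior at the origin can be probed numerically. The sketch below is a rough finite-difference experiment under assumed step sizes, not a proof; the variable names describe the order of the operations explicitly so that no subscript convention is presupposed.

```python
def f(x, y):
    # Counterexample function; the value at the origin is defined to be 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def fx(x, y, d=1e-6):
    """Central-difference approximation of the partial derivative in x."""
    return (f(x + d, y) - f(x - d, y)) / (2.0 * d)

def fy(x, y, d=1e-6):
    """Central-difference approximation of the partial derivative in y."""
    return (f(x, y + d) - f(x, y - d)) / (2.0 * d)

t = 1e-2  # outer step (an assumption); it must be much larger than d

# Differentiate f_x with respect to y at the origin, and f_y with respect to x.
d_y_of_fx = (fx(0.0, t) - fx(0.0, -t)) / (2.0 * t)
d_x_of_fy = (fy(t, 0.0) - fy(-t, 0.0)) / (2.0 * t)

print(d_y_of_fx, d_x_of_fy)
```

The two printed values come out near −1 and +1 respectively, so the result at the origin depends on the order of differentiation.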


Exercise 1.4d 


1. Obtain $\partial^2 z/(\partial x\,\partial y)$ and $\partial^2 z/(\partial y\,\partial x)$ to confirm their equality.


(a) z = (ax + by)? (d) z =y e7 
(b) z = Vax + by (e) z = log == 
(c) z =f (ax + by) (£) z = ecos(y?+z) 


2. Find all partial derivatives through the third order of the following 
functions: 
(a) f(x, y) = x” 
(b) f(x, y) = cosh xy 
(c) f(x, y) = ax? + bxy + cy? 
_ x y 
(d) f(x, y) = y + = 


(e) f(x, y) = 2 cos x + 3 sin (y — x). 
3. Show for $f(x, y) = \log(e^x + e^y)$ that $f_x + f_y = 1$ and $f_{xx} f_{yy} - (f_{xy})^2 = 0$.


Problems 1.4d 


1. (a) Show that a function of the form u(x, y) = f(x) g(y) satisfies the 
partial differential equation 


and consequently

$f_{yx}(0, 0) = -1$ and $f_{xy}(0, 0) = +1$.
These two expressions are different, which by the above theorem can only be caused
by the discontinuity of $f_{xy}$ at the origin.

$u u_{xy} - u_x u_y = 0.$
(b) Prove the converse statement.
2. Define f(x, y) as:
$f(x, y) = x^2 \arctan\frac{y}{x} - y^2 \arctan\frac{x}{y}$ for $x, y \ne 0$,


$f(x, y) = 0$ for x = 0 or y = 0.
Show that $f_{xy}(0, 0) = -1$, $f_{yx}(0, 0) = 1$.


1.5 The Total Differential of a Function and Its Geometrical 
Meaning 


a. The Concept of Differentiability 


For functions y = f(x) of one variable, the existence of a derivative 
is intimately connected with the possibility of approximating the 
function f in the neighborhood of a value x by a linear function; 
geometrically, this corresponds to approximating the graph of f by its 
tangent. By definition, the function f has a derivative at the point 
x if the limit 


$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = A$

exists; the value A of the limit is denoted by f'(x). Thus, differentiability
of f at the point x means that for fixed x the increment
$\Delta f = f(x + h) - f(x)$ corresponding to the increment $h = \Delta x$ of the
independent variable can be written in the form


$\Delta f = f(x + h) - f(x) = Ah + \varepsilon h,$

where A does not depend on h and $\lim_{h \to 0} \varepsilon = 0$. Letting $x + h = \xi$ we
may say that $f(\xi)$ is approximated by a linear function of $\xi$, namely
$\varphi(\xi) = f(x) + A(\xi - x)$, with an error that is of higher than the first
order in $\xi - x$:


$f(\xi) - \varphi(\xi) = \varepsilon \cdot (\xi - x) = o(\xi - x)$ for $\xi \to x$.


Of course, the graph of this linear function
$\eta = \varphi(\xi) = f(x) + f'(x)(\xi - x)$ in running coordinates $\xi$, $\eta$ is just the tangent to the
graph of f at the point (x, y). Formulated differently, differentiability
of f at x means that the increment $\Delta f$ considered as a function of
$h = \Delta x$ can be approximated by the linear function
$df = f'(x) h = f'(x) dx$ within an error that is of higher than the first order in h.¹


¹For the independent variable x we have $dx = 1 \cdot h = h = \Delta x$.

These ideas can be extended in a perfectly natural way to functions
of two and more variables.

We say that the function u = f(x, y) is differentiable at the point
(x, y) if it can be approximated in the neighborhood of this point by
a linear function, that is, if it can be represented in the form


(13)   $f(x + h, y + k) = Ah + Bk + C + \varepsilon \sqrt{h^2 + k^2},$


where A, B, and C are independent of the variables h and k and
where $\varepsilon$ tends to 0 as h and k do. In other words, the difference between
the function f(x + h, y + k) at the point (x + h, y + k) and
the function Ah + Bk + C, which is linear in h and k, must be of
order of magnitude $o(\rho)$, where $\rho = \sqrt{h^2 + k^2}$ denotes the distance
of the point (x + h, y + k) from the point (x, y).

If such an approximate representation is possible, it follows at once
that the function f(x, y) is continuous and has partial derivatives with
respect to x and to y at the point (x, y) and that


$A = f_x(x, y), \quad B = f_y(x, y), \quad C = f(x, y).$


For first of all we find from (13) for h = k = 0 that f(x, y) = C. Moreover,

$\lim_{h \to 0,\, k \to 0} f(x + h, y + k) = C = f(x, y).$

Thus f is continuous at the point (x, y). Setting k = 0 in (13) and
dividing by h yields the relation


$\frac{f(x + h, y) - f(x, y)}{h} = A \pm \varepsilon.$


Since $\varepsilon$ tends to 0 as h tends to 0, the left-hand side has a limit, and
that limit is A. Similarly, we obtain the equation $f_y(x, y) = B$.

Conversely, we shall prove the fundamental fact: 

A function u = f(x, y) is differentiable in the sense just defined— 
that is, it can be approximated by a linear function with an error o(p) 
as in (13)—if it possesses continuous derivatives of the first order 
at the point in question. 

Indeed, we can write the increment


$\Delta u = f(x + h, y + k) - f(x, y)$

of the function in the form


$\Delta u = f(x + h, y + k) - f(x, y + k) + f(x, y + k) - f(x, y).$

As before, the two differences can be expressed in the form

$\Delta u = h f_x(x + \theta_1 h, y + k) + k f_y(x, y + \theta_2 k),$

where $0 < \theta_1, \theta_2 < 1$, using the ordinary mean value theorem of
differential calculus. Since by hypothesis the partial derivatives $f_x$
and $f_y$ are continuous at the point (x, y), we can write


$f_x(x + \theta_1 h, y + k) = f_x(x, y) + \varepsilon_1$

and

$f_y(x, y + \theta_2 k) = f_y(x, y) + \varepsilon_2,$

where the numbers $\varepsilon_1$ and $\varepsilon_2$ tend to 0 as h and k do. We thus obtain

$\Delta u = h f_x(x, y) + k f_y(x, y) + \varepsilon_1 h + \varepsilon_2 k$
$= h f_x(x, y) + k f_y(x, y) + o(\sqrt{h^2 + k^2}),$


and this equation expresses the differentiability of f.¹
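
The meaning of the o(√(h² + k²)) remainder can be seen numerically: for a function with continuous first derivatives, the error of the linear approximation divided by the distance ρ tends to 0. The sketch below is an illustration only; the sample function x²y + sin y, the base point, and the direction angle are assumptions.

```python
import math

def f(x, y):
    # Sample function with continuous first partial derivatives (an assumption).
    return x * x * y + math.sin(y)

def fx(x, y):
    return 2.0 * x * y

def fy(x, y):
    return x * x + math.cos(y)

def remainder_ratio(x, y, rho, angle):
    """Return |Delta u - (h f_x + k f_y)| / rho for h = rho cos(angle), k = rho sin(angle)."""
    h, k = rho * math.cos(angle), rho * math.sin(angle)
    delta_u = f(x + h, y + k) - f(x, y)
    linear_part = h * fx(x, y) + k * fy(x, y)
    return abs(delta_u - linear_part) / rho

ratios = [remainder_ratio(1.0, 0.5, rho, 0.7) for rho in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

Each tenfold reduction of ρ reduces the ratio by about a factor of ten, which is what a remainder of higher than first order looks like in practice.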

We shall occasionally refer to a function with continuous first
partial derivatives as a continuously differentiable function or as a
function of class $C^1$. We see that functions of class $C^1$ are differentiable.
If in addition all the second-order partial derivatives are continuous,
we say that the function is twice continuously differentiable,
or of class $C^2$, and so on. The continuous functions are also referred
to as the functions of class $C^0$.²


Exercises 1.5a 


1. Show that each of the following functions is not differentiable at the 
origin: 
(a) f(x, y) = Vx cos y 
(b) $f(x, y) = \sqrt{|xy|}$


¹If we assume merely the existence, and not the continuity, of the derivatives $f_x$ and
$f_y$, the function need not be differentiable (cf. p. 34).

²These definitions of class $C^1$, $C^2$, and so on apply only to functions f whose domain
is an open set, since partial derivatives have been defined only for interior points of
the domain. One can extend the notion of class to functions f with a nonopen domain
R; it then means that the derivatives of f in question exist at all interior points of R
and coincide at those points with functions that are defined and continuous throughout R.

2xy
(c) f(x, y) = baer ye? (x, y) # (0, 0) 


0, (x, y) = (0, 0). 
2. For g(x), h(y) continuous functions of x, y in the intervals $[x_0, x_1]$,
$[y_0, y_1]$, respectively, show that the function
$f(x, y) = \left(\int_{x_0}^{x} g(s)\,ds\right) \left(\int_{y_0}^{y} h(t)\,dt\right)$
is differentiable at (x, y) for $x_0 < x < x_1$, $y_0 < y < y_1$.


Problems 1.5a 


1. Suppose that in a neighborhood of the point (a, b),
$f(x, y) = f(a, b) + h f_x(a, b) + k f_y(a, b) + o(\sqrt{h^2 + k^2})$, where h = x - a and k = y - b. On
the assumption that $f_x$ and $f_y$ exist at (a, b) but are not necessarily
continuous there, prove that f is continuous at (a, b).


b. Directional Derivatives 


A basic property of differentiable functions f is that they not only
possess partial derivatives with respect to x and y (or, as we also
say, in the x- and y-directions) but that they have derivatives in any
direction and that these derivatives can all be expressed in terms of
$f_x$ and $f_y$. By the derivative in the direction $\alpha$ we mean the rate of
change of f at the point (x, y) with respect to distance as we approach
(x, y) along the ray that forms the angle $\alpha$ with the positive x-axis.
The points (x + h, y + k) of the ray are the ones for which h and k
have the form


$h = \rho \cos\alpha, \quad k = \rho \sin\alpha,$


where $\rho = \sqrt{h^2 + k^2}$ is the distance of (x + h, y + k) from (x, y). Along
the ray f becomes a function of $\rho$ given by


$f(x + \rho \cos\alpha, y + \rho \sin\alpha).$


The derivative of f at the point (x, y) in the direction $\alpha$ is defined as the
derivative of $f(x + \rho \cos\alpha, y + \rho \sin\alpha)$ with respect to $\rho$ at $\rho = 0$
and denoted by $D_{(\alpha)} f(x, y)$. Thus,


$D_{(\alpha)} f(x, y) = \left( \frac{d}{d\rho} f(x + \rho \cos\alpha, y + \rho \sin\alpha) \right)_{\rho = 0} = \lim_{\rho \to 0} \frac{f(x + \rho \cos\alpha, y + \rho \sin\alpha) - f(x, y)}{\rho},$

provided the limit exists. In particular, we obtain for $\alpha = 0$ and $\alpha = \pi/2$
the partial derivatives of f:


$D_{(0)} f(x, y) = \lim_{\rho \to 0} \frac{f(x + \rho, y) - f(x, y)}{\rho} = f_x(x, y),$


$D_{(\pi/2)} f(x, y) = \lim_{\rho \to 0} \frac{f(x, y + \rho) - f(x, y)}{\rho} = f_y(x, y).$


If f(x, y) is differentiable, we have


(14)   $f(x + h, y + k) - f(x, y) = h f_x + k f_y + \varepsilon \rho = \rho (f_x \cos\alpha + f_y \sin\alpha + \varepsilon).$


Let $\rho$ tend to 0; then, since $\varepsilon$ tends to 0, we obtain for the derivative
of f in the direction $\alpha$ the expression


(14a)   $D_{(\alpha)} f(x, y) = f_x \cos\alpha + f_y \sin\alpha.$


Thus the directional derivative $D_{(\alpha)} f$ is a linear combination of the
derivatives $f_x$ and $f_y$ in the x- and y-directions with the coefficients
$\cos\alpha$ and $\sin\alpha$. This result holds in particular whenever the derivatives
$f_x$ and $f_y$ exist and are continuous at the point in question.

Taking, for example, for f(x, y) the distance $r = \sqrt{x^2 + y^2}$ from the
origin to the point (x, y), we have the partial derivatives


$r_x = \frac{x}{\sqrt{x^2 + y^2}} = \frac{x}{r} = \cos\theta$ and $r_y = \frac{y}{\sqrt{x^2 + y^2}} = \frac{y}{r} = \sin\theta,$


where $\theta$ denotes the angle that the radius vector makes with the x-axis.
Consequently, in the direction $\alpha$ the function r has the derivative


$D_{(\alpha)} r = r_x \cos\alpha + r_y \sin\alpha = \cos\theta \cos\alpha + \sin\theta \sin\alpha = \cos(\theta - \alpha);$


in particular, in the direction of the radius vector itself (i.e., in the
direction away from the origin), this derivative has the value 1, while
in the directions perpendicular to the radius vector, it has the value 0.

The function x has, in the direction of the radius vector, the
derivative $D_{(\theta)} x = \cos\theta$, and the function y, the derivative $D_{(\theta)} y = \sin\theta$;
in the direction perpendicular to the radius vector these
functions have the derivatives $D_{(\theta + \pi/2)} x = -\sin\theta$ and $D_{(\theta + \pi/2)} y = \cos\theta$,
respectively.

The derivative of a function f(x, y) in the direction of the radius
vector is in general denoted by $\partial f(x, y)/\partial r$. It is really the partial
derivative with respect to r of $f(r \cos\theta, r \sin\theta)$ considered as a
function of r and $\theta$. Thus, we have the relation


$\frac{\partial f}{\partial r} = \cos\theta \frac{\partial f}{\partial x} + \sin\theta \frac{\partial f}{\partial y},$


which we write conveniently in symbolic form as the identity


$\frac{\partial}{\partial r} = \cos\theta \frac{\partial}{\partial x} + \sin\theta \frac{\partial}{\partial y}$

between the differentiation operators $\partial/\partial r$, $\partial/\partial x$, $\partial/\partial y$.

It is worth noting that we also obtain the derivative of the function
f(x, y) in the direction $\alpha$ if, instead of allowing the point Q with
coordinates (x + h, y + k) to approach the point P with coordinates
(x, y) along a straight line with the direction $\alpha$, we let Q approach P
along an arbitrary curve whose tangent at P has the direction $\alpha$. For
then if the line PQ has the direction $\beta$, we can write $h = \rho \cos\beta$,
$k = \rho \sin\beta$, and in the formulae (14) used in the proof above we have
to replace $\alpha$ by $\beta$. But since by hypothesis $\beta$ tends to $\alpha$ as $\rho \to 0$, we
obtain the same expression as for $D_{(\alpha)} f(x, y)$.

In the same way, a differentiable function f(x, y, z) of three
independent variables can be differentiated in a given direction. We
suppose that the direction is specified by the cosines of the three
angles that it forms with the coordinate axes. If we call these three
angles $\alpha$, $\beta$, $\gamma$ and if we consider two points (x, y, z) and
(x + h, y + k, z + l), where


$h = \rho \cos\alpha, \quad k = \rho \cos\beta, \quad l = \rho \cos\gamma,$

then just as in (14a), we obtain the expression

(14b)   $f_x \cos\alpha + f_y \cos\beta + f_z \cos\gamma$


for the derivative in the direction given by the angles $(\alpha, \beta, \gamma)$.
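
The formula for the directional derivative of r can be compared with a direct difference quotient along the ray. This is a small numerical sketch under assumed values: the point (3, 4), the step ρ = 10⁻⁶, and the list of angles are illustrative choices, not from the text.

```python
import math

def r(x, y):
    """Distance of (x, y) from the origin."""
    return math.hypot(x, y)

def directional_derivative(f, x, y, alpha, rho=1e-6):
    """One-sided difference quotient of f along the ray in direction alpha."""
    h, k = rho * math.cos(alpha), rho * math.sin(alpha)
    return (f(x + h, y + k) - f(x, y)) / rho

x, y = 3.0, 4.0
theta = math.atan2(y, x)  # angle of the radius vector with the x-axis

alphas = [0.0, math.pi / 6, theta, theta + math.pi / 2]
rows = [(a, directional_derivative(r, x, y, a), math.cos(theta - a)) for a in alphas]

for a, numeric, formula in rows:
    print(f"alpha = {a:.4f}   numeric = {numeric:.6f}   cos(theta - alpha) = {formula:.6f}")
```

The third row corresponds to the direction of the radius vector (value close to 1) and the fourth to a perpendicular direction (value close to 0).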
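
Expression (14b) is straightforward to evaluate once the direction cosines are known. The sketch below is illustrative only: the function xy + yz², the point (1, 2, 3), and the direction vector (4, 3, 0) are assumptions chosen so that the arithmetic is easy to follow.

```python
import math

def f(x, y, z):
    # Sample differentiable function of three variables (an assumption).
    return x * y + y * z * z

def grad_f(x, y, z):
    """First partial derivatives (f_x, f_y, f_z) of the sample function."""
    return (y, x + z * z, 2.0 * y * z)

def direction_cosines(v):
    """Cosines of the angles that the vector v forms with the coordinate axes."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def directional_derivative(point, v):
    """Evaluate f_x cos(alpha) + f_y cos(beta) + f_z cos(gamma), formula (14b)."""
    cos_a, cos_b, cos_c = direction_cosines(v)
    f_x, f_y, f_z = grad_f(*point)
    return f_x * cos_a + f_y * cos_b + f_z * cos_c

point = (1.0, 2.0, 3.0)
v = (4.0, 3.0, 0.0)
exact = directional_derivative(point, v)

# Cross-check with a difference quotient along the same direction.
rho = 1e-6
cosines = direction_cosines(v)
shifted = tuple(p + rho * c for p, c in zip(point, cosines))
numeric = (f(*shifted) - f(*point)) / rho

print(exact, numeric)
```

Here the direction cosines are (0.8, 0.6, 0) and the first derivatives at the point are (2, 10, 12), so (14b) gives 1.6 + 6 = 7.6.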


Exercises 1.5b 


1. What is the geometrical interpretation of the derivative $D_{(\alpha)} f(x, y)$ of
the function f in the direction defined by the angle of inclination $\alpha$?
2. Find $D_{(\alpha)} f(x_0, y_0)$, $\alpha$ = 0°, 30°, 60°, 90°, for the following functions:
(a) f(x, y) = ax + by, a, b constants, $x_0 = y_0 = 0$
(b) f(x, y) = ax? + yb, xo = yo = 1, (a, b constants) 
(c) f(x, y) = x? — y?, xo = 1, yo = 2 
(d) f(x, y) = sin x + cos y, xo = yo = 0 
(e) f(x, y) = e? cos y, xo = 0, yo = T 
(f) f(x, y) = V2x?2 + y2, xo = 1, yo = 1 
(g) f(x, y) = cos (x + y), xo = 0, yo = 0. 
3. Find the directional derivatives of each of the following functions as 
indicated: 
(a) z2 — x? — y? at (1, 0, 1) in the direction of (4, 3, 0). 
(b) xyz — xy — yz — zx + x + y + z at (2, 2, 1) 
in the direction of (2, 2, 0). 
(c) xz? + y? + 2? at (1, 0, —1) in the direction of (2, 1, 0). 
4. Give an example of a function that has derivatives in every direction 
at a point yet is not differentiable at that point. 


5. Show for $f(x, y) = \sqrt[3]{xy}$ that f is continuous and that the partial derivatives
$\partial f/\partial x$ and $\partial f/\partial y$ exist at the origin but that the directional derivatives
in all other directions do not exist.

6. Let f(x,y) = xy + VOx® + y2, r = Vx? +} y?, y/x = tan 9. Find 3?f/ðr? for 
© = 0°, 30°, 60°, 90°, and x, y = 1. 


c. Geometrical Interpretation of Differentiability. 
The Tangent Plane 


For a function z = f(x, y) all these concepts can easily be illustrated
geometrically. We recall that the partial derivative with respect to
x is the slope of the tangent to the curve in which the surface
representing the relation z = f(x, y) is intersected by a plane
perpendicular to the x, y-plane and parallel to the x-axis. In the same way,
the derivative in the direction $\alpha$ gives the slope of the tangent to the
curve in which the surface is intersected by a plane through (x, y, z)
that is perpendicular to the x, y-plane and makes the angle $\alpha$ with
the x-axis. The formula $D_{(\alpha)} f(x, y) = f_x \cos\alpha + f_y \sin\alpha$ now enables
us to calculate the slopes of the tangents to all such curves, that is, of
all tangents to the surface at a given point, from the slopes of two such
tangents.¹


¹For points $(\xi, \eta, \zeta)$ in that plane we have $\xi = x + \rho \cos\alpha$, $\eta = y + \rho \sin\alpha$, and thus
for points on the curve of intersection,

We have approximated the differentiable function $\zeta = f(\xi, \eta)$ in
the neighborhood of the point (x, y) by the linear function


$\varphi(\xi, \eta) = f(x, y) + (\xi - x) f_x + (\eta - y) f_y,$


where $\xi$ and $\eta$ are the current coordinates. Geometrically, this
linear function is represented by a plane, which by analogy with the
tangent line to a curve we shall call the tangent plane to the surface.
The difference between this linear function and the function $f(\xi, \eta)$
vanishes to a higher order than $\sqrt{h^2 + k^2}$ as $\xi - x = h$ and $\eta - y = k$
tend to 0. Recalling the definition of the tangent to a plane curve, however,
this means that the line of intersection of the tangent plane
with any plane perpendicular to the x, y-plane is the tangent to the
corresponding curve of intersection. We thus see that all these tangent
lines to the surface at the point (x, y, z) lie in one plane, the tangent
plane.

This property is the geometrical expression of the differentiability
of the function at the point (x, y, z) where z = f(x, y). In running
coordinates $(\xi, \eta, \zeta)$, the equation of the tangent plane at the point
(x, y, z) is


$\zeta - z = (\xi - x) f_x + (\eta - y) f_y.$


As has already been shown on p. 41, the function is differentiable
at a given point provided that the partial derivatives are continuous
there. In contrast with the case of functions of one independent
variable, the mere existence of the partial derivatives $f_x$ and $f_y$ is not
sufficient to ensure the differentiability of the function. If the derivatives
are not continuous at the point in question, the tangent plane to
the surface at this point may fail to exist; or, analytically speaking,
the difference between f(x + h, y + k) and the function
$f(x, y) + h f_x(x, y) + k f_y(x, y)$, which is linear in h and k, may fail to vanish to
a higher order than $\sqrt{h^2 + k^2}$. This is clearly shown by a simple
example:


$\zeta = f(x + \rho \cos\alpha, y + \rho \sin\alpha).$


Using $\rho$ and $\zeta$ as coordinates, the slope of the tangent to the curve at $\zeta = z$, $\rho = 0$
is given by


$\left( \frac{d\zeta}{d\rho} \right)_{\rho = 0} = D_{(\alpha)} f(x, y).$

Hence, the tangent has the equation


$\zeta = z + \rho D_{(\alpha)} f(x, y) = f(x, y) + \rho \cos\alpha \, f_x(x, y) + \rho \sin\alpha \, f_y(x, y).$

$u = f(x, y) = \frac{xy}{\sqrt{x^2 + y^2}}$ if $x^2 + y^2 \ne 0$,


$u = 0$ if x = 0, y = 0.


If we introduce polar coordinates this becomes


$u = \frac{r}{2} \sin 2\theta.$


The first derivatives with respect to x and to y exist everywhere in the
neighborhood of the origin and have the value 0 at the origin itself.
These derivatives, however, are not continuous at the origin, for


$u_x = \frac{y}{\sqrt{x^2 + y^2}} - \frac{x^2 y}{\sqrt{(x^2 + y^2)^3}} = \frac{y^3}{\sqrt{(x^2 + y^2)^3}}.$

If we approach the origin along the x-axis, $u_x$ tends to 0, while if we
approach along the positive y-axis, $u_x$ tends to 1. This function is not
differentiable at the origin; at that point no tangent plane to the surface
z = f(x, y) exists. For the equations $f_x(0, 0) = f_y(0, 0) = 0$ show that
the tangent plane would have to coincide with the plane z = 0. But
at the points of the line $\theta = \pi/4$, we have $\sin 2\theta = 1$ and
z = f(x, y) = r/2; thus, the distance z of the point of the surface from the
point of the plane does not, as must be the case with a tangent plane,
vanish to a higher order than r. The surface is a cone with vertex at
the origin, whose generators do not all lie in one plane.
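
The failure of differentiability in this example can be observed directly: along the line θ = π/4 the quotient of the function value by the distance r stays at 1/2 instead of tending to 0, although both partial derivatives vanish at the origin. The following sketch simply evaluates these quantities; the list of radii is an arbitrary illustrative choice.

```python
import math

def u(x, y):
    """The example function xy / sqrt(x^2 + y^2), set to 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / math.hypot(x, y)

# Both partial derivatives at the origin vanish: u is identically 0 on the axes.
step = 1e-3
ux_origin = (u(step, 0.0) - u(0.0, 0.0)) / step
uy_origin = (u(0.0, step) - u(0.0, 0.0)) / step

# The candidate tangent plane is z = 0, so the error of the linear
# approximation at distance r is just |u|. Follow the line theta = pi/4.
c = s = math.cos(math.pi / 4)
ratios = [abs(u(r * c, r * s)) / r for r in (1e-1, 1e-3, 1e-5, 1e-8)]

print(ux_origin, uy_origin, ratios)
```

Since the ratios do not decrease with r, the remainder is not of higher than first order, and no tangent plane exists at the origin.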


Exercises 1.5c 


1. Find the equation of the tangent plane to the surface defined by z = 
f(x, y) at the point P = (xo, yo) in each of the following cases: 


(a) f(x, y) = 3x? + 4y?, P = (0, 1) 

(b) f(x, y) = 2 cos (x — y) + 3 sin x, P= |7, 7) 
(c) f(x, y) = cosh (x + y), P = (0, log 2) 

(d) f(x, y) = vx? + y?2, P=, 2) 

(e) f(x, y) = er ss, P= (1, 7 

(£) f(x, y) = cos x e7”, P= (log 2, 1) 


r2+y2 
D fe y= fj edt, P=, 
(h) $f(x, y) = ax^3 + bx^2 y + cxy^2 + dy^3$, P = (1, 1), (a, b, c, d constants)
2. Show that all tangent planes to a surface z = y f(x/y) meet in a common
point, where f is any differentiable function of one variable.


3. Show that the tangent plane to the surface S: z = f(x, y) at the point
$P_0 = (x_0, y_0)$ is the limiting position of the plane passing through the
three points $(x_i, y_i, z_i)$, i = 0, 1, 2, of S where $P_1 = (x_1, y_1)$ and
$P_2 = (x_2, y_2)$ approach $P_0$ from distinct directions, making an angle not equal
to 0° or 180°.


4. Prove that the tangent plane to the quadric surface
$ax^2 + by^2 + cz^2 = 1$
at the point $(x_0, y_0, z_0)$ is


$a x_0 x + b y_0 y + c z_0 z = 1.$


d. The Differential of a Function 


As for functions of one variable, it is often convenient to have a
special name and symbol for the linear part of the increment of a
differentiable function u = f(x, y) which occurs in formula (14),


$\Delta u = f(x + h, y + k) - f(x, y) = h f_x(x, y) + k f_y(x, y) + \varepsilon \sqrt{h^2 + k^2}.$


We call this linear part the differential of the function, and write


(15a)   $du = df(x, y) = \frac{\partial f}{\partial x} h + \frac{\partial f}{\partial y} k = \frac{\partial f}{\partial x} \Delta x + \frac{\partial f}{\partial y} \Delta y.$


The differential, sometimes called the total differential, is a function
of four independent variables, namely, the coordinates x and y of the
point under consideration and the increments h and k of the independent
variables. We emphasize again that this has nothing to do
with the vague concept of “infinitely small quantities.” It simply
means that du approximates to the increment
$\Delta u = f(x + h, y + k) - f(x, y)$ of the function, with an error that is an arbitrarily small
fraction $\varepsilon$ of $\sqrt{h^2 + k^2}$, provided that h and k are sufficiently small
quantities. For the independent variables x and y we find from (15a)
that


$dx = \frac{\partial x}{\partial x} \Delta x + \frac{\partial x}{\partial y} \Delta y = \Delta x$ and $dy = \frac{\partial y}{\partial x} \Delta x + \frac{\partial y}{\partial y} \Delta y = \Delta y.$


Hence, the differential df(x, y) is written more commonly


(15b)   $df(x, y) = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy = f_x(x, y)\,dx + f_y(x, y)\,dy.$

Incidentally, the differential completely determines the first partial
derivatives of f. For example, we obtain the partial derivative $\partial f/\partial x$
from df by putting dy = 0 and dx = 1.

We emphasize that the total differential of a function f(x, y) as the
linear approximation to $\Delta f$ has no meaning unless the function is
differentiable in the sense defined above (for which the continuity,
but not the mere existence, of the two partial derivatives suffices).

If the function f(x, y) also has continuous partial derivatives of
higher order, we can form the differential of the differential df(x, y);
that is, we can multiply its partial derivatives with respect to x and y
by h = dx and k = dy, respectively, and then add these products. In
this differentiation, we regard h and k as constants, corresponding
to the fact that the differential $df = h f_x(x, y) + k f_y(x, y)$ is a function
of the four independent variables x, y, h, and k. We thus obtain the
second differential¹ of the function,


$d^2 f = d(df) = \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} h + \frac{\partial f}{\partial y} k \right) h + \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} h + \frac{\partial f}{\partial y} k \right) k$


$= \frac{\partial^2 f}{\partial x^2} h^2 + 2 \frac{\partial^2 f}{\partial x\,\partial y} hk + \frac{\partial^2 f}{\partial y^2} k^2$


$= \frac{\partial^2 f}{\partial x^2} dx^2 + 2 \frac{\partial^2 f}{\partial x\,\partial y} dx\,dy + \frac{\partial^2 f}{\partial y^2} dy^2.$²


Similarly, we may form the higher differentials


$d^3 f = d(d^2 f) = \frac{\partial^3 f}{\partial x^3} dx^3 + 3 \frac{\partial^3 f}{\partial x^2\,\partial y} dx^2\,dy + 3 \frac{\partial^3 f}{\partial x\,\partial y^2} dx\,dy^2 + \frac{\partial^3 f}{\partial y^3} dy^3,$


$d^4 f = \frac{\partial^4 f}{\partial x^4} dx^4 + 4 \frac{\partial^4 f}{\partial x^3\,\partial y} dx^3\,dy + 6 \frac{\partial^4 f}{\partial x^2\,\partial y^2} dx^2\,dy^2 + 4 \frac{\partial^4 f}{\partial x\,\partial y^3} dx\,dy^3 + \frac{\partial^4 f}{\partial y^4} dy^4,$


and, as is easily shown by induction, in general


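¹We shall later see (p. 68) that the differentials of higher order introduced formally
here correspond exactly to the terms of the same order in the expansion of the
function.

²Traditionally, one writes the powers $(dx)^2$, $(dx)^3$, $(dy)^2$, $(dy)^3$ of differentials simply
as $dx^2$, $dx^3$, $dy^2$, $dy^3$. This is, of course, somewhat misleading, since they might be
confused with $d(x^2) = 2x\,dx$, $d(x^3) = 3x^2\,dx$, and so on.

$d^n f = \frac{\partial^n f}{\partial x^n} dx^n + \binom{n}{1} \frac{\partial^n f}{\partial x^{n-1}\,\partial y} dx^{n-1}\,dy + \cdots + \binom{n}{k} \frac{\partial^n f}{\partial x^{n-k}\,\partial y^k} dx^{n-k}\,dy^k + \cdots + \frac{\partial^n f}{\partial y^n} dy^n.$

The last formula can be expressed symbolically by the equation

$d^n f = \left( \frac{\partial}{\partial x} dx + \frac{\partial}{\partial y} dy \right)^n f,$


where the expression on the right is first to be expanded formally by
the binomial theorem, and then the terms


$\frac{\partial^n f}{\partial x^n} dx^n, \quad \frac{\partial^n f}{\partial x^{n-1}\,\partial y} dx^{n-1}\,dy, \quad \ldots, \quad \frac{\partial^n f}{\partial y^n} dy^n$


are to be substituted for

$\left( \frac{\partial}{\partial x} dx \right)^n f, \quad \left( \frac{\partial}{\partial x} dx \right)^{n-1} \left( \frac{\partial}{\partial y} dy \right) f, \quad \ldots, \quad \left( \frac{\partial}{\partial y} dy \right)^n f.$


For calculations with differentials the rule


$d(fg) = f\,dg + g\,df$


holds good; this follows immediately from the rule for the differentiation
of a product.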

In conclusion, we remark that the discussion in this section can 
immediately be extended to functions of more than two independent 
variables. 
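
The construction of the second differential as "the differential of the differential, with h and k held constant" can be replayed on a concrete function. The sketch below is illustrative: the function x²y³, the point (1, 2), and the increments h = 0.1, k = −0.2 are assumptions. It compares the closed formula for d²f with a numerical differentiation of df in x and y.

```python
def df(x, y, h, k):
    """First differential of the sample function f = x^2 y^3: f_x h + f_y k."""
    return 2.0 * x * y**3 * h + 3.0 * x**2 * y**2 * k

def d2f_formula(x, y, h, k):
    """f_xx h^2 + 2 f_xy h k + f_yy k^2 for the same sample function."""
    f_xx = 2.0 * y**3
    f_xy = 6.0 * x * y**2
    f_yy = 6.0 * x**2 * y
    return f_xx * h * h + 2.0 * f_xy * h * k + f_yy * k * k

def d2f_as_d_of_df(x, y, h, k, e=1e-6):
    """Differentiate df in x and y (h, k fixed), multiply by h and k, and add."""
    ddf_dx = (df(x + e, y, h, k) - df(x - e, y, h, k)) / (2.0 * e)
    ddf_dy = (df(x, y + e, h, k) - df(x, y - e, h, k)) / (2.0 * e)
    return ddf_dx * h + ddf_dy * k

x, y, h, k = 1.0, 2.0, 0.1, -0.2
by_formula = d2f_formula(x, y, h, k)
by_construction = d2f_as_d_of_df(x, y, h, k)
print(by_formula, by_construction)
```

With these values the formula gives 16·0.01 + 2·24·(−0.02) + 12·0.04 = −0.32, and the numerically differentiated df agrees to within rounding.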


Exercises 1.5d 


1. Find the total differentials for the following functions: 
(a) z = x?y? + 3xy3 — 2y4 


xy 
(b) z =x? + 2y? 
(c) z = log(x* — y’) 
@ z= +5 
(e) z = cos (x + log y) 
_*—y 
€) z= x+y 


(g) z = arc tan (x + y)
(h) z= xy
(i) w = cosh (x + y - z)


(j) w = x? — 2xz + y?. 

2. Evaluate the total differential of f(x) = x — y + (x? + y?)1/3, for x = 1, 
y = 2, dx = .1, dy = 23. 

3. Find d?f(x, y) for f(x, y) = e7? +¥?, 


e. Application to the Calculus of Errors 


The differential $df = h f_x + k f_y$ is often used in practice as a
convenient approximation to the increment of the function f(x, y),
$\Delta f = f(x + h, y + k) - f(x, y)$, as we pass from (x, y) to (x + h, y + k).
This use is exhibited particularly well in the so-called “calculus of
errors” (cf. Volume I, p. 490). Suppose, for example, that we wish to
find the possible error in the determination of the density of a solid
body by the method of displacement. If m is the weight of the body in
air and $\bar{m}$ its weight when submerged in water, then by Archimedes’s
principle, the loss of weight $(m - \bar{m})$ is the weight of the water
displaced. If we are using the cgs (centimeter-gram-second) system
of units, the weight of the water displaced is numerically equal to its
volume and hence to the volume of the solid. The density s of the body
is thus given in terms of the independent variables m and $\bar{m}$ by the
formula $s = m/(m - \bar{m})$. The error in the measurement of the density
s caused by an error dm in the measurement of m and an error $d\bar{m}$
in the measurement of $\bar{m}$ is given approximately by the total differential


$ds = \frac{\partial s}{\partial m} dm + \frac{\partial s}{\partial \bar{m}} d\bar{m}.$


By the quotient rule, the partial derivatives are


$\frac{\partial s}{\partial m} = -\frac{\bar{m}}{(m - \bar{m})^2}$ and $\frac{\partial s}{\partial \bar{m}} = \frac{m}{(m - \bar{m})^2};$


hence, the differential is


$ds = \frac{-\bar{m}\,dm + m\,d\bar{m}}{(m - \bar{m})^2}.$
Thus the error in s is greatest if dm and $d\bar{m}$ have opposite sign, say,
if instead of m we measure too small an amount m + dm and instead
of $\bar{m}$ too large an amount $\bar{m} + d\bar{m}$. For example, if a piece of brass
weighs about 100 gm in air, with a possible error of 0.005 gm, and in water
weighs about 88 gm, with a possible error of 0.008 gm, the density is
given by our formula to within an error of about


$\frac{88 \cdot 5 \cdot 10^{-3} + 100 \cdot 8 \cdot 10^{-3}}{12^2} \approx 9 \cdot 10^{-3};$


since $s \approx 100/12 \approx 8.3$, this is a relative error of about 0.1 percent.
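
The brass example can be recomputed in a few lines. The sketch below uses only the figures quoted above (100 gm, 88 gm, 0.005 gm, 0.008 gm); the variable names are ours, and the comparison with the exactly recomputed density is an added check rather than part of the original argument.

```python
def density(m, m_bar):
    """Density s = m / (m - m_bar) from weight in air m and weight in water m_bar."""
    return m / (m - m_bar)

m, m_bar = 100.0, 88.0      # measured weights in gm
dm, dm_bar = 0.005, 0.008   # possible errors in gm (magnitudes)

# Worst case of ds = (-m_bar dm + m dm_bar) / (m - m_bar)^2:
# the two contributions add when dm and dm_bar have opposite signs.
ds_max = (m_bar * dm + m * dm_bar) / (m - m_bar) ** 2

s = density(m, m_bar)
relative = ds_max / s

# Exact change of s for the worst-case pair of measurement errors.
actual = density(m - dm, m_bar + dm_bar) - s

print(f"s = {s:.4f}, ds_max = {ds_max:.5f}, relative = {relative:.5f}, actual = {actual:.5f}")
```

The differential estimate of about 0.0086 agrees with the exactly recomputed change to within 10⁻⁴, which illustrates how well the linear part represents the increment for small errors.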


Exercises 1.5e 


1. Find the approximate variation of the function z = (x + y)/(x — y), as x 
varies from x = 2 to x = 2.5, and y, from y = 4 to y = 4.5. 


2. Approximate the value of $\log[(1.02)^{1/4} + (0.96)^{1/6} - 1]$.


3. The base length x and height y of a right triangle are known to within 
errors of h, k, respectively. What is the possible error in the area? 


4. If dz is the error of measurement in a quantity z, the relative error is 
defined as dz/z. Show that the relative error in a product z = xy is the 
sum of the relative errors in the factors. 


5. The acceleration g of gravity is to be determined by timing the fall in 
seconds of a body dropped from rest through a fixed distance x. If the 
measured time is t, we have $g = 2x/t^2$. If x is about 1 m and t about .45 sec,
show that the relative error of measurement in g is more sensitive to a
relative error in t than to a relative error in x.


1.6 Functions of Functions (Compound Functions) and the 
Introduction of New Independent Variables 


a. Compound Functions. The Chain Rule 


Frequently a function u of the independent variables x, y is given
in the form

$$u = f(\xi, \eta, \ldots),$$

where the arguments $\xi, \eta, \ldots$ of f are themselves functions of x
and y:

$$\xi = \phi(x, y), \quad \eta = \psi(x, y), \quad \ldots$$

We then say that

(16) $$u = f(\xi, \eta, \ldots) = f(\phi(x, y), \psi(x, y), \ldots) = F(x, y)$$

is a compound function of x and y (compare Volume I, pp. 52 ff.).
For example, the function

(16a) $$u = F(x, y) = e^{xy} \sin(x + y)$$

may be written as a compound function by means of the relations

(16b) $$u = f(\xi, \eta) = e^{\xi} \sin \eta,$$

where $\xi = xy$ and $\eta = x + y$. Similarly, the function

(16c) $$u = F(x, y) = \log(x^4 + y^4) \cdot \arcsin\sqrt{1 - x^2 - y^2}$$

can be expressed in the form

(16d) $$u = f(\xi, \eta) = \eta \arcsin \xi,$$

where $\xi = \sqrt{1 - x^2 - y^2}$ and $\eta = \log(x^4 + y^4)$.
In order to make the concept of compound function meaningful we
assume that the functions $\xi = \phi(x, y)$, $\eta = \psi(x, y)$, ... have the
common domain R and map any point (x, y) of R into a point
$(\xi, \eta, \ldots)$ for which the function $u = f(\xi, \eta, \ldots)$ is defined, that
is, into a point of the domain S of f. The compound function

$$u = f(\phi(x, y), \psi(x, y), \ldots) = F(x, y)$$

is then defined in the region R.

A detailed examination of the regions R and S is often unnecessary,
as in (16b), in which the argument point (x, y) can traverse the entire
x, y-plane and the function $u = e^{\xi} \sin \eta$ is defined throughout the
$\xi, \eta$-plane. On the other hand, (16d) shows the necessity for examining
the domains R and S in the definition of compound functions.
For the functions $\xi = \sqrt{1 - x^2 - y^2}$ and $\eta = \log(x^4 + y^4)$ are defined
only in the region R consisting of the points $0 < x^2 + y^2 \le 1$, that is,
the closed unit disk with center at the origin, the origin being deleted.
Within this region we have $|\xi| < 1$, $\eta \le 0$. The corresponding points
$(\xi, \eta)$ all lie in the domain of the function $\eta \arcsin \xi$, and thus the
compound function F(x, y) is defined in R.

A continuous function of continuous functions is itself continuous.
More precisely, if the function $u = f(\xi, \eta, \ldots)$ is continuous in the
region S, and the functions $\xi = \phi(x, y)$, $\eta = \psi(x, y)$, ... are
continuous in the region R, then the compound function u = F(x, y)
is continuous in R.

The proof follows immediately from the definition of continuity.
Let $(x_0, y_0)$ be a point of R, and let $\xi_0, \eta_0, \ldots$ be the corresponding
values of $\xi, \eta, \ldots$. Now for any positive $\varepsilon$ the absolute value of
the difference

$$f(\xi, \eta, \ldots) - f(\xi_0, \eta_0, \ldots)$$

is less than $\varepsilon$, provided only that the inequality

$$\sqrt{(\xi - \xi_0)^2 + (\eta - \eta_0)^2 + \cdots} < \delta$$

is satisfied, where $\delta$ is a sufficiently small positive number. But by
the continuity of $\phi(x, y)$, $\psi(x, y)$, ... this inequality is satisfied if

$$\sqrt{(x - x_0)^2 + (y - y_0)^2} < \gamma,$$

where $\gamma$ is a sufficiently small positive quantity. This establishes the
continuity of the compound function.

Similarly, a differentiable function of differentiable functions is itself
differentiable. This statement is formulated more precisely in the
following theorem, which at the same time gives the rule for the
differentiation of compound functions, the so-called chain rule:

If $\xi = \phi(x, y)$, $\eta = \psi(x, y)$, ... are differentiable functions of
x and y in the region R and if $f(\xi, \eta, \ldots)$ is a differentiable function
of $\xi, \eta, \ldots$ in the region S, then the compound function

(17) $$u = f(\phi(x, y), \psi(x, y), \ldots) = F(x, y)$$

is also a differentiable function of x and y; its partial derivatives are
given by the formulae

(18) $$F_x = f_\xi \phi_x + f_\eta \psi_x + \cdots, \qquad F_y = f_\xi \phi_y + f_\eta \psi_y + \cdots,$$

or, briefly, by

(19) $$u_x = u_\xi \xi_x + u_\eta \eta_x + \cdots, \qquad u_y = u_\xi \xi_y + u_\eta \eta_y + \cdots.$$

Thus, in order to form the partial derivative with respect to x, we
must first differentiate the compound function with respect to each of
the variables $\xi, \eta, \ldots$, multiply each of these derivatives by the
derivative of the corresponding variable with respect to x, and add all
the products thus formed. This is the generalization of the chain rule
for functions of one variable discussed in Volume I (p. 218).
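
The chain rule can be checked numerically on the compound function (16a), written as in (16b) with xi = xy and eta = x + y. The sketch below compares the derivatives formed by multiplying and adding partial derivatives with central difference quotients; the function names, the step size, and the test point are illustrative choices.

```python
import math

# Inner functions of (16b): xi = x*y, eta = x + y
def phi(x, y):
    return x * y

def psi(x, y):
    return x + y

# Outer function f(xi, eta) = exp(xi) * sin(eta) and its partial derivatives
def f(xi, eta):
    return math.exp(xi) * math.sin(eta)

def f_xi(xi, eta):
    return math.exp(xi) * math.sin(eta)

def f_eta(xi, eta):
    return math.exp(xi) * math.cos(eta)

def F(x, y):
    # The compound function F(x, y) = f(phi(x, y), psi(x, y))
    return f(phi(x, y), psi(x, y))

def chain_rule_gradient(x, y):
    xi, eta = phi(x, y), psi(x, y)
    phi_x, phi_y = y, x          # partial derivatives of xi = x*y
    psi_x, psi_y = 1.0, 1.0      # partial derivatives of eta = x + y
    F_x = f_xi(xi, eta) * phi_x + f_eta(xi, eta) * psi_x
    F_y = f_xi(xi, eta) * phi_y + f_eta(xi, eta) * psi_y
    return F_x, F_y

def finite_difference_gradient(x, y, step=1e-6):
    F_x = (F(x + step, y) - F(x - step, y)) / (2 * step)
    F_y = (F(x, y + step) - F(x, y - step)) / (2 * step)
    return F_x, F_y

x0, y0 = 0.3, -0.7
exact = chain_rule_gradient(x0, y0)
approx = finite_difference_gradient(x0, y0)
print("chain rule        :", exact)
print("difference quotient:", approx)
```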

Our statement can be written in a particularly simple and suggestive
form if we use the notation of differentials, namely,

(20) $$\begin{aligned}
du &= u_\xi\, d\xi + u_\eta\, d\eta + \cdots \\
   &= u_\xi (\xi_x\, dx + \xi_y\, dy) + u_\eta (\eta_x\, dx + \eta_y\, dy) + \cdots \\
   &= (u_\xi \xi_x + u_\eta \eta_x + \cdots)\, dx + (u_\xi \xi_y + u_\eta \eta_y + \cdots)\, dy \\
   &= u_x\, dx + u_y\, dy.
\end{aligned}$$

This equation shows that we obtain the linear part of the increment
of the compound function $u = f(\xi, \eta, \ldots) = F(x, y)$ by first
writing this linear part as if $\xi, \eta, \ldots$ were the independent variables
and then replacing $d\xi, d\eta, \ldots$ by the linear parts of the
increments of the functions $\xi = \phi(x, y)$, $\eta = \psi(x, y)$, .... This
fact exhibits the convenience and flexibility of the differential notation.

In order to prove our statement (18) we have merely to make use of
the assumption that the functions concerned are differentiable. From
this it follows that corresponding to the increments $\Delta x$ and $\Delta y$ of the
independent variables x and y the quantities $\xi, \eta, \ldots$ change by
the amounts

(20a) $$\Delta\xi = \phi_x \Delta x + \phi_y \Delta y + \varepsilon_1 \sqrt{(\Delta x)^2 + (\Delta y)^2},$$

(20b) $$\Delta\eta = \psi_x \Delta x + \psi_y \Delta y + \varepsilon_2 \sqrt{(\Delta x)^2 + (\Delta y)^2}, \quad \ldots,$$

where the numbers $\varepsilon_1, \varepsilon_2, \ldots$ tend to 0 for $\Delta x \to 0$ and $\Delta y \to 0$ or for
$\sqrt{(\Delta x)^2 + (\Delta y)^2} \to 0$. The derivatives $\phi_x, \phi_y, \psi_x, \psi_y$ are taken for
the arguments x, y. Moreover, if the quantities $\xi, \eta, \ldots$ undergo
changes $\Delta\xi, \Delta\eta, \ldots$, the function $u = f(\xi, \eta, \ldots)$ changes by
the amount

(21) $$\Delta u = f_\xi \Delta\xi + f_\eta \Delta\eta + \cdots + \delta \sqrt{(\Delta\xi)^2 + (\Delta\eta)^2 + \cdots},$$

where the quantity $\delta$ tends to 0 for $\Delta\xi \to 0$ and $\Delta\eta \to 0$, and $f_\xi, f_\eta$
have the arguments $\xi, \eta$. Using here for $\Delta\xi, \Delta\eta, \ldots$ the amounts
given by formulae (20a, b) corresponding to increments $\Delta x$ and $\Delta y$
in x and y, we find an equation of the form

(22) $$\Delta u = (f_\xi \phi_x + f_\eta \psi_x + \cdots)\Delta x + (f_\xi \phi_y + f_\eta \psi_y + \cdots)\Delta y + \varepsilon \sqrt{(\Delta x)^2 + (\Delta y)^2}.$$

Here, for $\Delta x = \rho \cos\alpha$, $\Delta y = \rho \sin\alpha$, $\rho = \sqrt{(\Delta x)^2 + (\Delta y)^2}$, the
quantity $\varepsilon$ is given by

$$\varepsilon = \varepsilon_1 f_\xi + \varepsilon_2 f_\eta + \cdots + \delta \sqrt{(\phi_x \cos\alpha + \phi_y \sin\alpha + \varepsilon_1)^2 + (\psi_x \cos\alpha + \psi_y \sin\alpha + \varepsilon_2)^2 + \cdots}.$$

For $\rho \to 0$ the quantities $\Delta x, \Delta y, \varepsilon_1, \varepsilon_2$ tend to 0 and, hence, so do
$\Delta\xi, \Delta\eta$, and $\delta$. On the other hand, $f_\xi, f_\eta, \ldots, \phi_x, \phi_y, \psi_x, \psi_y, \ldots$ stay
fixed. Consequently,

$$\lim_{\rho \to 0} \varepsilon = 0.$$

It follows from (22) that u considered as a function of the independent
variables x, y is differentiable at the point (x, y) and that du is given
by equation (20). From this expression for du we find that the partial
derivatives $u_x, u_y$ have the expressions (19) or (18).

Clearly this result is independent of the number of independent
variables x, y, .... It remains valid, for example, if the quantities
$\xi, \eta, \ldots$ depend on only one independent variable x, so that u is a
compound function of the single variable x.

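To calculate the higher partial derivatives, we need only differentiate
the right-hand sides of our equations (19) with respect to x
and y, treating $f_\xi, f_\eta, \ldots$ as compound functions. Confining
ourselves for the sake of simplicity to the case of three functions
$\xi$, $\eta$, and $\zeta$, we obtain¹

(23a) $$\begin{aligned}
u_{xx} &= f_{\xi\xi}\xi_x^2 + f_{\eta\eta}\eta_x^2 + f_{\zeta\zeta}\zeta_x^2 + 2f_{\xi\eta}\xi_x\eta_x + 2f_{\eta\zeta}\eta_x\zeta_x \\
       &\quad + 2f_{\xi\zeta}\xi_x\zeta_x + f_\xi \xi_{xx} + f_\eta \eta_{xx} + f_\zeta \zeta_{xx},
\end{aligned}$$

(23b) $$\begin{aligned}
u_{xy} &= f_{\xi\xi}\xi_x\xi_y + f_{\eta\eta}\eta_x\eta_y + f_{\zeta\zeta}\zeta_x\zeta_y + f_{\xi\eta}(\xi_x\eta_y + \xi_y\eta_x) \\
       &\quad + f_{\eta\zeta}(\eta_x\zeta_y + \eta_y\zeta_x) + f_{\xi\zeta}(\xi_x\zeta_y + \xi_y\zeta_x) \\
       &\quad + f_\xi \xi_{xy} + f_\eta \eta_{xy} + f_\zeta \zeta_{xy},
\end{aligned}$$

(23c) $$\begin{aligned}
u_{yy} &= f_{\xi\xi}\xi_y^2 + f_{\eta\eta}\eta_y^2 + f_{\zeta\zeta}\zeta_y^2 + 2f_{\xi\eta}\xi_y\eta_y + 2f_{\eta\zeta}\eta_y\zeta_y \\
       &\quad + 2f_{\xi\zeta}\xi_y\zeta_y + f_\xi \xi_{yy} + f_\eta \eta_{yy} + f_\zeta \zeta_{yy}.
\end{aligned}$$

¹It is assumed here that f is a function of $\xi, \eta, \zeta$ of class $C^2$ and that $\xi, \eta, \zeta$ are
functions of x, y of class $C^2$. It follows that the compound function u of x and y again
is of class $C^2$.


Exercises 1.6a


1. Find all partial derivatives of first and second order with respect to x
and y for the following:
(a) $z = u \log v$, where $u = x^2$, $v = \dfrac{1}{1 + y}$
(b) $z = e^{uv}$, where $u = ax$, $v = \cos y$
(c) $z = u \arctan v$, where $u = \dfrac{xy}{x - y}$, $v = xy + y - x$
(d) $z = g(x^2 + y^2, e^{xy})$
(e) $z = \tan(x \arctan y)$.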


2. Calculate the partial derivatives of the first order for

(a) $w = \dfrac{1}{\sqrt{x^2 + y^2 + 2xy \cos z}}$


. x 
(b) w = arc sin z4+y y? 

(c) $w = x^2 + y \log(1 + x^2 + y^2 + z^2)$
(d) $w = \arctan\sqrt{x + yz}$


3. Calculate the derivatives of 


(a) z =x0®, 


o- 


x 
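4. Prove that if f(x, y) satisfies Laplace’s equation

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0,$$

so does $\phi(x, y) = f\left(\dfrac{x}{x^2 + y^2}, \dfrac{y}{x^2 + y^2}\right)$.
5. Prove that the functions
(a) $f(x, y) = \log\sqrt{x^2 + y^2}$,
(b) $g(x, y, z) = \dfrac{1}{\sqrt{x^2 + y^2 + z^2}}$,
(c) $h(x, y, z, w) = \dfrac{1}{x^2 + y^2 + z^2 + w^2}$
satisfy the respective Laplace’s equations,
(a) $f_{xx} + f_{yy} = 0$,
(b) $g_{xx} + g_{yy} + g_{zz} = 0$,
(c) $h_{xx} + h_{yy} + h_{zz} + h_{ww} = 0$.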


Problems 1.6a 


1. Prove that if f(x, y) satisfies Laplace’s equation

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0$$

and if u(x, y) and v(x, y) satisfy the Cauchy-Riemann equations,

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x},$$

then the function $\phi(x, y) = f(u(x, y), v(x, y))$ is also a solution of
Laplace’s equation.
2. Prove that if z = f(x, y) is the equation of a cone, then

$$f_{xx} f_{yy} - f_{xy}^2 = 0.$$

3. Let f(x, y, z) = g(r), where $r = \sqrt{x^2 + y^2 + z^2}$.
(a) Calculate $f_{xx} + f_{yy} + f_{zz}$.
(b) Prove that if $f_{xx} + f_{yy} + f_{zz} = 0$, then $f(x, y, z) = \dfrac{a}{r} + b$, where a
and b are constants.
4. Let $f(x_1, x_2, \ldots, x_n) = g(r)$, where

$$r = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$$

(a) Calculate $f_{x_1 x_1} + f_{x_2 x_2} + \cdots + f_{x_n x_n}$ (compare 1.4.a, Exercise 10).
(b) Solve $f_{x_1 x_1} + f_{x_2 x_2} + \cdots + f_{x_n x_n} = 0$.


b. Examples¹

1. Let us consider the function

$$u = \exp(x^2 \sin^2 y + 2xy \sin x \sin y + y^2).$$

We put

$$u = e^{\xi + \eta + \zeta}, \quad \xi = x^2 \sin^2 y, \quad \eta = 2xy \sin x \sin y, \quad \zeta = y^2$$

and obtain

$$\xi_x = 2x \sin^2 y, \quad \eta_x = 2y \sin x \sin y + 2xy \cos x \sin y, \quad \zeta_x = 0;$$

$$\xi_y = 2x^2 \sin y \cos y, \quad \eta_y = 2x \sin x \sin y + 2xy \sin x \cos y, \quad \zeta_y = 2y;$$

$$u_\xi = u_\eta = u_\zeta = e^{\xi + \eta + \zeta}.$$

Hence

$$u_x = 2 \exp(x^2 \sin^2 y + 2xy \sin x \sin y + y^2)\,(x \sin^2 y + y \sin x \sin y + xy \cos x \sin y)$$

and

$$u_y = 2 \exp(x^2 \sin^2 y + 2xy \sin x \sin y + y^2)\,(x^2 \sin y \cos y + x \sin x \sin y + xy \sin x \cos y + y).$$


¹We note that the following differentiations can also be carried out directly, without
using the chain rule for functions of several variables.

2. For the function

$$u = \sin(x^2 + y^2)$$

we put $\xi = x^2 + y^2$ and obtain

$$u_x = 2x \cos(x^2 + y^2), \qquad u_y = 2y \cos(x^2 + y^2),$$
$$u_{xx} = -4x^2 \sin(x^2 + y^2) + 2 \cos(x^2 + y^2),$$
$$u_{xy} = -4xy \sin(x^2 + y^2),$$
$$u_{yy} = -4y^2 \sin(x^2 + y^2) + 2 \cos(x^2 + y^2).$$


3. For the function

$$u = \arctan(x^2 + xy + y^2),$$

the substitution $\xi = x^2$, $\eta = xy$, $\zeta = y^2$ leads to

$$u_x = \frac{2x + y}{1 + (x^2 + xy + y^2)^2}, \qquad u_y = \frac{x + 2y}{1 + (x^2 + xy + y^2)^2}.$$
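
As a check on Example 3, the closed-form derivatives can be compared with central difference quotients. This is only a numerical sanity check at one arbitrarily chosen point; the helper `central_difference` and the step size are illustrative.

```python
import math

def u(x, y):
    return math.atan(x**2 + x * y + y**2)

def u_x(x, y):
    # Derivative from Example 3
    return (2 * x + y) / (1 + (x**2 + x * y + y**2) ** 2)

def u_y(x, y):
    return (x + 2 * y) / (1 + (x**2 + x * y + y**2) ** 2)

def central_difference(g, x, y, dx, dy, step=1e-6):
    # Derivative of g at (x, y) along the direction (dx, dy)
    forward = g(x + step * dx, y + step * dy)
    backward = g(x - step * dx, y - step * dy)
    return (forward - backward) / (2 * step)

x0, y0 = 0.8, -0.5
numeric_x = central_difference(u, x0, y0, 1.0, 0.0)
numeric_y = central_difference(u, x0, y0, 0.0, 1.0)
print("u_x:", u_x(x0, y0), numeric_x)
print("u_y:", u_y(x0, y0), numeric_y)
```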


c. Change of the Independent Variables 


The application of the chain rule (19) to a change of the independent
variables is particularly important. For example, let $u = f(\xi, \eta)$
be a function of the two independent variables $\xi, \eta$, which
we interpret as rectangular coordinates in the $\xi, \eta$-plane. We can
introduce new rectangular coordinates x, y in that plane (see Volume
I, p. 361) related to $\xi, \eta$ by the formulae

(24a) $$\xi = \alpha_1 x + \beta_1 y, \qquad \eta = \alpha_2 x + \beta_2 y$$

or

(24b) $$x = \alpha_1 \xi + \alpha_2 \eta, \qquad y = \beta_1 \xi + \beta_2 \eta.$$

Here,

$$\alpha_1 = \cos\gamma, \quad \alpha_2 = -\sin\gamma, \quad \beta_1 = \sin\gamma, \quad \beta_2 = \cos\gamma,$$

where $\gamma$ denotes the angle the positive $\xi$-axis forms with the positive
x-axis. The function $u = f(\xi, \eta)$ is then “transformed” into a new
function

$$u = f(\xi, \eta) = f(\alpha_1 x + \beta_1 y, \alpha_2 x + \beta_2 y) = F(x, y),$$

which is formed from $f(\xi, \eta)$ by a process of compounding as described
on p. 53. We say that the dependent variable u is “referred
to the new independent variables x and y instead of $\xi$ and $\eta$.”

The rules of differentiation (19) on p. 55 at once yield

(25) $$u_x = u_\xi \alpha_1 + u_\eta \alpha_2, \qquad u_y = u_\xi \beta_1 + u_\eta \beta_2,$$

where $u_x, u_y$ denote the partial derivatives of the function F(x, y),
and $u_\xi, u_\eta$ the partial derivatives of the function $f(\xi, \eta)$. Thus the
partial derivatives of any function are transformed according to the
same law (24b) as the independent variables when the coordinate axes
are rotated. This is true for rotation of the axes in space as well.¹

Another important change of the independent variables is that
from rectangular coordinates (x, y) to polar coordinates $(r, \theta)$. The
polar coordinates are connected with the rectangular coordinates by
the equations

(26a) $$x = r \cos\theta, \qquad y = r \sin\theta,$$

(26b) $$r = \sqrt{x^2 + y^2}, \qquad \theta = \arccos\frac{x}{\sqrt{x^2 + y^2}} = \arcsin\frac{y}{\sqrt{x^2 + y^2}}.$$

Referring a function u = f(x, y) to polar coordinates, we have

$$u = f(x, y) = f(r \cos\theta, r \sin\theta) = F(r, \theta),$$

and u appears as a compound function of the independent variables
r and $\theta$. Hence, by the chain rule (19) we obtain

(27) $$\begin{aligned}
u_x &= u_r r_x + u_\theta \theta_x = u_r \frac{x}{r} - u_\theta \frac{y}{r^2} = u_r \cos\theta - u_\theta \frac{\sin\theta}{r}, \\
u_y &= u_r r_y + u_\theta \theta_y = u_r \frac{y}{r} + u_\theta \frac{x}{r^2} = u_r \sin\theta + u_\theta \frac{\cos\theta}{r}.
\end{aligned}$$

These yield the useful equation

(28) $$u_x^2 + u_y^2 = u_r^2 + \frac{1}{r^2} u_\theta^2.$$
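
Equations (27) and (28) lend themselves to a quick numerical check. In the sketch below the test function f(x, y) = x²y + sin y, the point, and the step size are arbitrary illustrative choices; the derivatives with respect to r and θ are formed by central differences of F(r, θ) = f(r cos θ, r sin θ).

```python
import math

def f(x, y):
    return x**2 * y + math.sin(y)

def f_x(x, y):
    return 2 * x * y

def f_y(x, y):
    return x**2 + math.cos(y)

def F(r, theta):
    # The same quantity u referred to polar coordinates
    return f(r * math.cos(theta), r * math.sin(theta))

def derivative(g, a, b, wrt, step=1e-6):
    # Central difference of g(a, b) with respect to argument 0 or 1
    if wrt == 0:
        return (g(a + step, b) - g(a - step, b)) / (2 * step)
    return (g(a, b + step) - g(a, b - step)) / (2 * step)

r0, theta0 = 1.7, 0.9
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)

u_r = derivative(F, r0, theta0, wrt=0)
u_theta = derivative(F, r0, theta0, wrt=1)

# Right-hand side of the first line of (27)
u_x_from_polar = u_r * math.cos(theta0) - u_theta * math.sin(theta0) / r0

# Both sides of (28)
lhs = f_x(x0, y0) ** 2 + f_y(x0, y0) ** 2
rhs = u_r**2 + u_theta**2 / r0**2
print(f_x(x0, y0), u_x_from_polar)
print(lhs, rhs)
```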


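¹But, in general, not for other types of coordinate transformation.

By the rules (23a, b, c), the higher derivatives are given by

$$u_{xx} = u_{rr} \cos^2\theta + u_{\theta\theta} \frac{\sin^2\theta}{r^2} - 2u_{r\theta} \frac{\cos\theta \sin\theta}{r} + u_r \frac{\sin^2\theta}{r} + 2u_\theta \frac{\cos\theta \sin\theta}{r^2},$$

$$u_{xy} = u_{yx} = u_{rr} \cos\theta \sin\theta - u_{\theta\theta} \frac{\cos\theta \sin\theta}{r^2} + u_{r\theta} \frac{\cos^2\theta - \sin^2\theta}{r} + u_\theta \frac{\sin^2\theta - \cos^2\theta}{r^2} - u_r \frac{\sin\theta \cos\theta}{r},$$

$$u_{yy} = u_{rr} \sin^2\theta + u_{\theta\theta} \frac{\cos^2\theta}{r^2} + 2u_{r\theta} \frac{\cos\theta \sin\theta}{r} + u_r \frac{\cos^2\theta}{r} - 2u_\theta \frac{\cos\theta \sin\theta}{r^2}.$$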


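This leads to the expression in polar coordinates of the so-called
Laplacian $\Delta u$, which appears in the important “Laplace,” or “potential,”
equation $\Delta u = 0$ (see p. 33):

(29) $$\Delta u = u_{xx} + u_{yy} = u_{rr} + \frac{1}{r^2} u_{\theta\theta} + \frac{1}{r} u_r = \frac{1}{r^2}\left[ r \frac{\partial}{\partial r}\left( r \frac{\partial u}{\partial r} \right) + \frac{\partial^2 u}{\partial \theta^2} \right].$$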
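
Formula (29) can be tested numerically by forming the Laplacian once from second difference quotients in x, y and once from difference quotients in r, θ. The following is a rough sketch with an arbitrarily chosen test function; the step size 1e-4 is a compromise between truncation and rounding error, so agreement is expected only to a few decimal places.

```python
import math

def f(x, y):
    return x**3 * y + math.exp(x) * math.sin(2 * y)

def F(r, theta):
    # f referred to polar coordinates
    return f(r * math.cos(theta), r * math.sin(theta))

def first(g, a, b, wrt, h=1e-4):
    # Central first difference quotient in argument 0 or 1
    if wrt == 0:
        return (g(a + h, b) - g(a - h, b)) / (2 * h)
    return (g(a, b + h) - g(a, b - h)) / (2 * h)

def second(g, a, b, wrt, h=1e-4):
    # Central second difference quotient in argument 0 or 1
    if wrt == 0:
        return (g(a + h, b) - 2 * g(a, b) + g(a - h, b)) / h**2
    return (g(a, b + h) - 2 * g(a, b) + g(a, b - h)) / h**2

r0, theta0 = 1.3, 0.6
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)

# u_xx + u_yy
laplacian_cartesian = second(f, x0, y0, 0) + second(f, x0, y0, 1)

# u_rr + u_theta_theta / r^2 + u_r / r, as in (29)
laplacian_polar = (second(F, r0, theta0, 0)
                   + second(F, r0, theta0, 1) / r0**2
                   + first(F, r0, theta0, 0) / r0)

# Exact value for this particular f: 6xy - 3 e^x sin 2y
laplacian_exact = 6 * x0 * y0 - 3 * math.exp(x0) * math.sin(2 * y0)
print(laplacian_cartesian, laplacian_polar, laplacian_exact)
```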


Conversely, we can apply the chain rule to express $u_r$ and $u_\theta$ in terms
of $u_x$ and $u_y$. We find in this way

(30a) $$u_r = u_x x_r + u_y y_r = u_x \cos\theta + u_y \sin\theta,$$
(30b) $$u_\theta = u_x x_\theta + u_y y_\theta = -u_x r \sin\theta + u_y r \cos\theta.$$

We can also derive these equations by solving relations (27) for $u_r$
and $u_\theta$. Incidentally, equation (30a) has been encountered already
as the expression for the derivative of u in the direction of the radius
vector r on p. 45.

In general, whenever we are given relations defining a compound
function,

$$u = f(\xi, \eta, \ldots), \qquad \xi = \phi(x, y), \quad \eta = \psi(x, y), \quad \ldots,$$

we may regard these as referring u to new independent variables x, y
instead of $\xi, \eta, \ldots$. Corresponding sets of values x, y and
$\xi, \eta, \ldots$ of the independent variables assign the same value to u,
whether it is regarded as a function $f(\xi, \eta, \ldots)$ of $\xi, \eta, \ldots$ or as a
function $F(x, y) = f(\phi(x, y), \psi(x, y), \ldots)$ of x, y.

In differentiations of a compound function $u = f(\xi, \eta, \ldots)$, we
must distinguish clearly between the dependent variable u and the
function $f(\xi, \eta, \ldots)$, which assigns values of u to values of the
independent variables $\xi, \eta, \ldots$. The symbols of differentiation
$u_\xi, u_\eta, \ldots$ have no meaning until the functional connection between
u and the independent variables is specified. When dealing with
compound functions $u = f(\xi, \eta, \ldots) = F(x, y)$, therefore, one
really ought not to write $u_\xi, u_\eta$ or $u_x, u_y$ but instead $f_\xi(\xi, \eta)$,
$f_\eta(\xi, \eta)$ or $F_x(x, y)$, $F_y(x, y)$, respectively. Yet, for the sake of brevity
the simpler symbols $u_\xi, u_\eta, u_x, u_y$ are often used when there is no risk
of confusion. The chain rule is then written in the form

(31) $$u_x = u_\xi \xi_x + u_\eta \eta_x, \qquad u_y = u_\xi \xi_y + u_\eta \eta_y,$$

which makes it unnecessary to give “names” f or F for the functional
relation between u and $\xi, \eta$ or x, y.

The following example illustrates the fact that the derivative of a
quantity u with respect to a given variable depends on the nature of
the functional connection between u and all of the independent
variables; in particular, it depends on which of the independent
variables are kept fixed during the differentiation. With the “identity
transformation” $\xi = x$, $\eta = y$ the function $u = 2\xi + \eta$ becomes
u = 2x + y, and we have $u_x = 2$, $u_y = 1$. If, however, we introduce
the new independent variables $\xi = x$ (as before) and $\xi + \eta = v$, we
find that u = x + v, so that $u_x = 1$, $u_v = 1$. Thus, differentiation
with respect to the same independent variable x gives different results
for different choices of the other variable.
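
The point of this example is easy to reproduce numerically: the difference quotient in x depends on which second variable is held fixed. In the sketch below the helper names are illustrative; `d_dx` varies the first argument of a function of two variables and keeps the second one fixed.

```python
def u_of_xi_eta(xi, eta):
    # The quantity u as a function of xi and eta
    return 2 * xi + eta

def u_in_xy(x, y):
    # Independent variables (x, y), with xi = x, eta = y
    return u_of_xi_eta(x, y)

def u_in_xv(x, v):
    # Independent variables (x, v), with xi = x, v = xi + eta, so eta = v - x
    return u_of_xi_eta(x, v - x)

def d_dx(g, x, other, step=1e-6):
    # Central difference in the first argument, second argument held fixed
    return (g(x + step, other) - g(x - step, other)) / (2 * step)

x0, y0 = 1.5, 0.4
v0 = x0 + y0                       # the same point described by (x, v)

ux_y_fixed = d_dx(u_in_xy, x0, y0)  # y held fixed
ux_v_fixed = d_dx(u_in_xv, x0, v0)  # v held fixed
print(ux_y_fixed, ux_v_fixed)
```

Both functions take the same value at corresponding points, yet the two derivatives with respect to x come out as 2 and 1 up to rounding.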


Exercises 1.6c 


1. Let u = f(x, y), where $x = r \cos\theta$, $y = r \sin\theta$. Express $\sqrt{u_x^2 + u_y^2}$ in
terms of $u_r$ and $u_\theta$.

2. Prove that the expression $f_{xx} + f_{yy}$ is unchanged by rotation of the
coordinate system.

3. Show that the linear changes of variables $x = \alpha\xi + \beta\eta$, $y = \gamma\xi + \delta\eta$
transform the derivatives $f_{xx}(x, y)$, $f_{xy}(x, y)$, $f_{yy}(x, y)$ by the same rule
as the coefficients a, b, c, respectively, of the polynomial

$$ax^2 + 2bxy + cy^2.$$

4. Given $z = r^2 \cos\theta$, where r and $\theta$ are polar coordinates, find $z_x$ and
$z_y$ at the point $\theta = \pi/4$, r = 2. Express $z_r$ and $z_\theta$ in terms of $z_x$ and $z_y$.

5. By the transformation $\xi = a + \alpha x + \beta y$, $\eta = b - \beta x + \alpha y$, in which
a, b, $\alpha$, $\beta$ are constants and $\alpha^2 + \beta^2 = 1$, the function u(x, y) is transformed
into a function $U(\xi, \eta)$ of $\xi$ and $\eta$. Prove that

$$U_{\xi\xi} U_{\eta\eta} - U_{\xi\eta}^2 = u_{xx} u_{yy} - u_{xy}^2.$$

6. Show how the expression $T_y - T_{xx}$ is transformed under the introduction
of a variable $z = x/\sqrt{y}$ in place of y.

7. (a) Prove that the function

$$h(x, y) = f(x - y) + g(x + y),$$

for any twice continuously differentiable functions f, g, satisfies the
condition $h_{xx} = h_{yy}$.
(b) Similarly, show that

$$H(x, y) = f(x - iy) + g(x + iy),$$

with $i^2 = -1$, satisfies the condition $H_{xx} = -H_{yy}$.


Problems 1.6c 


1. Transform the Laplacian $u_{xx} + u_{yy} + u_{zz}$ into three-dimensional polar
coordinates $r, \theta, \phi$ defined by

$$x = r \sin\theta \cos\phi, \qquad y = r \sin\theta \sin\phi, \qquad z = r \cos\theta.$$

Compare with 1.6.a, Problem 3.

2. Find values a, b, c, d such that under the transformation $\xi = ax + by$,
$\eta = cx + dy$, where $ad - bc \ne 0$, the equation $A f_{xx} + 2B f_{xy} + C f_{yy} = 0$
(A, B, C constants) becomes
(a) $f_{\xi\xi} + f_{\eta\eta} = 0$
(b) $f_{\xi\eta} = 0$.
Is this always possible?


1.7 The Mean Value Theorem and Taylor’s Theorem for 
Functions of Several Variables 


a. Preliminary Remarks About Approximation by Polynomials 


We have already seen in Volume I (Chapter V, p. 451) how a
function of a single variable can be approximated in the neighborhood
of a given point with an accuracy higher than the nth order
by means of a polynomial of degree n, the Taylor polynomial, provided
that the function possesses derivatives up to the (n + 1)th order.
Approximation by means of the linear part of the function, as given
by the differential, is only the first step toward this closer approximation.
In the case of functions of several variables, for example, of
two independent variables, we may also seek an approximate representation
in the neighborhood of a given point by means of a
polynomial of degree n. In other words, we wish to approximate
f(x + h, y + k) by means of a “Taylor expansion” in terms of the
increments h and k.

By a simple device this problem can be reduced to one for functions
of only one variable. Instead of just considering f(x + h, y + k), we
introduce an additional variable t and regard the expression

(31) $$F(t) = f(x + ht, y + kt)$$

as a function of t, keeping x, y, h, and k fixed for the moment. As t
varies between 0 and 1, the point with coordinates (x + ht, y + kt)
traverses the line segment joining (x, y) and (x + h, y + k). The
Taylor expansion of F(t) according to powers of t will yield for t = 1
an approximation to f(x + h, y + k) of the desired kind.

We begin by calculating the derivatives of F(t). If we assume
that all the derivatives of the function f(x, y) that we are about to
write down are continuous in a region entirely containing the line
segment, the chain rule (18) at once gives¹

(32a) $$F'(t) = h f_x + k f_y,$$
(32b) $$F''(t) = h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy},$$

and, in general, we find by mathematical induction that the nth
derivative is given by the expression

(32c) $$F^{(n)}(t) = h^n f_{x^n} + \binom{n}{1} h^{n-1} k f_{x^{n-1}y} + \binom{n}{2} h^{n-2} k^2 f_{x^{n-2}y^2} + \cdots + k^n f_{y^n},$$

¹We have from the chain rule

$$F'(t) = \frac{d}{dt} f(x + ht, y + kt) = h f_\xi(\xi, \eta) + k f_\eta(\xi, \eta),$$

where $\xi = x + ht$, $\eta = y + kt$. We write here $f_x(x + ht, y + kt)$ for $f_\xi(x + ht, y + kt)$
since (again by the chain rule)

$$\frac{\partial}{\partial x} f(x + ht, y + kt) = f_\xi(x + ht, y + kt)$$

if x, y, h, k are considered independent variables.




which, as on p. 51, can be written symbolically in the form

$$F^{(n)}(t) = \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right)^n f.$$

In this formula the symbolic power on the right is to be expanded by
the binomial theorem and then the powers of $\partial/\partial x$, $\partial/\partial y$ multiplied
by f are to be replaced by the corresponding nth derivatives $\partial^n f/\partial x^n$,
$\partial^n f/\partial x^{n-1}\partial y$, .... In all these derivatives the arguments x + ht and
y + kt are to be written in place of x and y.
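
Formula (32b) can be compared with a second difference quotient of F(t) itself. The sketch below uses the arbitrarily chosen function f(x, y) = sin x · e^y, whose second derivatives are written out by hand; the base point, the increments h, k, and the step size are illustrative.

```python
import math

def f(x, y):
    return math.sin(x) * math.exp(y)

# Second partial derivatives of this particular f
def f_xx(x, y):
    return -math.sin(x) * math.exp(y)

def f_xy(x, y):
    return math.cos(x) * math.exp(y)

def f_yy(x, y):
    return math.sin(x) * math.exp(y)

x, y, h, k = 0.2, -0.1, 0.5, 0.3

def F(t):
    # Restriction of f to the line through (x, y) with direction (h, k)
    return f(x + h * t, y + k * t)

def F_second_by_formula(t):
    # (32b), with the arguments x + ht and y + kt in place of x and y
    a, b = x + h * t, y + k * t
    return h * h * f_xx(a, b) + 2 * h * k * f_xy(a, b) + k * k * f_yy(a, b)

def F_second_by_differences(t, step=1e-4):
    return (F(t + step) - 2 * F(t) + F(t - step)) / step**2

t0 = 0.4
print(F_second_by_formula(t0), F_second_by_differences(t0))
```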


Exercises 1.7a 


1. For F(t) = f(x + ht, y + kt) find F’(1) for: 
(a) f(x, y) = sin (x + y) 


—9 
(b) f(x, y= > 


(c) f(x, y) = x? + xy? — y* 
2. Find the slope of the curve z(t) = F(t) = f(x + ht, y + kt) at t = 1, for 
x=0,y=1, h = 4» k = 4, and 
(a) f(x, y) = x? + y? 
(b) f(x, y) = exp [x? + (y —1)°] 
(c) f(x, y) = cos r (y — 1) sin rx? 


b. The Mean Value Theorem 


Before taking up higher order approximations by polynomials, we
derive a mean value theorem analogous to the one we already know
for functions of one variable. This theorem relates the difference
f(x + h, y + k) - f(x, y) to the partial derivatives $f_x$ and $f_y$. We
expressly assume that these derivatives are continuous. On applying
the ordinary mean value theorem to the function F(t) we obtain

$$\frac{F(t) - F(0)}{t} = F'(\theta t),$$

where $\theta$ is a number between 0 and 1; using (31) and (32a) it follows
that

$$\frac{f(x + ht, y + kt) - f(x, y)}{t} = h f_x(x + \theta ht, y + \theta kt) + k f_y(x + \theta ht, y + \theta kt).$$

Setting t = 1, we obtain the required mean value theorem for functions
of two variables in the form

(33) $$f(x + h, y + k) - f(x, y) = h f_x(x + \theta h, y + \theta k) + k f_y(x + \theta h, y + \theta k).$$

Thus, the difference between the values of the function at the points
(x + h, y + k) and (x, y) is equal to the differential at an intermediate
point $(\xi, \eta)$ on the line segment joining the two points. It is worth
noting that the same value of $\theta$ occurs in both $f_x$ and $f_y$.
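
For a concrete function the intermediate value θ in (33) can be located numerically. The sketch below uses bisection on the arbitrarily chosen example f(x, y) = x³ + y² with x = y = 0, h = k = 1, for which (33) reduces to 3θ² + 2θ = 2. Bisection is an assumption of this sketch, not part of the theorem: it works here because the two sides of (33) differ in sign at θ = 0 and θ = 1, which need not happen in general.

```python
def f(x, y):
    return x**3 + y**2

def f_x(x, y):
    return 3 * x**2

def f_y(x, y):
    return 2 * y

def mean_value_gap(theta, x, y, h, k):
    # Right side of (33) minus left side; zero at an admissible theta
    xi, eta = x + theta * h, y + theta * k
    return h * f_x(xi, eta) + k * f_y(xi, eta) - (f(x + h, y + k) - f(x, y))

def find_theta(x, y, h, k, iterations=60):
    lo, hi = 0.0, 1.0
    if mean_value_gap(lo, x, y, h, k) * mean_value_gap(hi, x, y, h, k) > 0:
        raise ValueError("no sign change on [0, 1]; bisection does not apply")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean_value_gap(lo, x, y, h, k) * mean_value_gap(mid, x, y, h, k) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

theta = find_theta(0.0, 0.0, 1.0, 1.0)
print(theta)  # root of 3*theta**2 + 2*theta - 2 in (0, 1)
```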

Just as for functions of a single variable (Volume I, p. 178), the
mean value theorem can be used to obtain a modulus of continuity for
a function f(x, y) and, more precisely, to show that a function f as
above is Lipschitz continuous. In order to apply the mean value
theorem we must be able to join two points by a straight line segment
along which f is defined. Assume then that the domain R of f(x, y)
is convex, that is, that the line segment joining any two points of R
lies completely in R. Let f be continuously differentiable in R and
let M be a bound for the absolute value of the derivatives of f:

$$|f_x(x, y)| \le M, \qquad |f_y(x, y)| \le M$$

for (x, y) in R. Then formula (33) can be applied and yields the
inequality

(34) $$|f(x + h, y + k) - f(x, y)| \le |h|\,|f_x(\xi, \eta)| + |k|\,|f_y(\xi, \eta)| \le |h| M + |k| M \le 2M \sqrt{h^2 + k^2}.$$

Hence, the numerical value of the difference in the values of f at two
points whose distance is $\rho = \sqrt{h^2 + k^2}$ does not exceed a fixed multiple
of the distance (namely, $2M\rho$). This is exactly what is meant by
Lipschitz continuity of f. In particular we have

$$|f(x + h, y + k) - f(x, y)| < \varepsilon$$

for $\sqrt{h^2 + k^2} < \varepsilon/2M$. Thus f is uniformly continuous in R with the
“modulus of continuity” $\delta = \varepsilon/2M$.

The following fact, the proof of which we leave to the reader, is a
simple consequence of the mean value theorem. A function f(x, y)
whose partial derivatives $f_x$ and $f_y$ exist and have the value 0 at every
point of a convex set is constant.


Exercises 1.7b

1. Interpret the mean value theorem geometrically.
2. Find a value $\theta$ for which
$$h f_x(x + \theta h, y + \theta k) + k f_y(x + \theta h, y + \theta k) = f(x + h, y + k) - f(x, y)$$
in each of the following cases:
(a) f(x, y) = xy + 3°, x=y=0,h=3 k =i 
(b) f(x, y) = sin z (x + y) x =y =3 h= k= 
3. Show that there is a number 9, 0 < 0 < 1 such that 
2 = cos $ + sin| 51 — J 
using the mean value theorem for the function 
f(x, y) = sin tx + cos rty. 


4. Derive the mean value theorem for a function f(x, y, z) of three variables. 
5. Find a number 9, 0 < 0 < 1, for which 


1 1 0 0 1 0 0 1 0 9 
where 


(a) f(x, y, z) = xyz 
(b) f(x, y, z) = x2 + y2 + 2xz 


Problems 1.7b 


1. Let the domain of f(x, y) be a polygonally connected region; that is,
suppose that any two points P, Q of the domain can be connected within
the domain by a sequence of segments $P_0P_1, P_1P_2, \ldots, P_{n-1}P_n$, where
$P_0 = P$ and $P_n = Q$. Prove that if the partial derivatives $f_x$ and $f_y$ have
the value 0 at every point of the domain, then f is constant.


c. Taylor’s Theorem for Several Independent Variables 


If we apply Taylor’s formula with Lagrange’s form of the remainder
(cf. Volume I, p. 452) to the function F(t) = f(x + ht, y + kt), use the
expressions (32a, b, c) for the derivatives of F, and put t = 1, we
obtain Taylor’s theorem for functions of two independent variables,

(35) $$\begin{aligned}
f(x + h, y + k) &= f(x, y) + \{h f_x(x, y) + k f_y(x, y)\} \\
&\quad + \frac{1}{2!}\{h^2 f_{xx}(x, y) + 2hk f_{xy}(x, y) + k^2 f_{yy}(x, y)\} \\
&\quad + \cdots + \frac{1}{n!}\left\{h^n f_{x^n}(x, y) + \binom{n}{1} h^{n-1} k f_{x^{n-1}y}(x, y) + \cdots + k^n f_{y^n}(x, y)\right\} + R_n,
\end{aligned}$$

where $R_n$ denotes the remainder term

(36) $$R_n = \frac{1}{(n + 1)!}\left\{h^{n+1} f_{x^{n+1}}(x + \theta h, y + \theta k) + \cdots + k^{n+1} f_{y^{n+1}}(x + \theta h, y + \theta k)\right\},$$

where $0 < \theta < 1$. The increment f(x + h, y + k) - f(x, y) is thus
written as a sum of homogeneous polynomials of degree 1, 2, ...,
n + 1, which, apart from the factors

$$\frac{1}{1!}, \quad \frac{1}{2!}, \quad \ldots, \quad \frac{1}{n!}, \quad \frac{1}{(n + 1)!},$$

are the first, second, ..., nth differentials

$$df = h f_x + k f_y = \left(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}\right) f,$$

$$d^2 f = \left(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}\right)^2 f = h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy},$$

$$d^n f = \left(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}\right)^n f = h^n f_{x^n} + \binom{n}{1} h^{n-1} k f_{x^{n-1}y} + \cdots + k^n f_{y^n}$$

of f(x, y) at the point (x, y) and the (n + 1)th differential $d^{n+1} f$ at an
intermediate point on the line segment joining (x, y) and (x + h,
y + k). Hence, Taylor’s theorem can be written more compactly as

(37) $$f(x + h, y + k) = f(x, y) + df(x, y) + \frac{1}{2!} d^2 f(x, y) + \cdots + \frac{1}{n!} d^n f(x, y) + R_n,$$

(38) $$R_n = \frac{1}{(n + 1)!} d^{n+1} f(x + \theta h, y + \theta k), \qquad 0 < \theta < 1.$$

In general the remainder $R_n$ vanishes to a higher order than the
term $d^n f$ just before it; that is, as $h \to 0$ and $k \to 0$, we have
$R_n = o\{\sqrt{(h^2 + k^2)^n}\}$.

From Taylor’s theorem for functions of one variable the passage
($n \to \infty$) to infinite Taylor series led us to the expansions of many
functions in power series. With functions of several variables such a
process, even when possible, is in general too complicated. For us the
importance of Taylor’s theorem lies rather in the fact that the increment
f(x + h, y + k) - f(x, y) of a function is split up into increments
$df, d^2 f, \ldots$ of different orders.
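
The order statement for the remainder can be observed numerically. The sketch below takes n = 2 and the arbitrarily chosen function f(x, y) = e^x cos y at the origin, where f = 1, f_x = 1, f_y = 0, f_xx = 1, f_xy = 0, f_yy = -1, so that the second-degree Taylor polynomial is 1 + h + (h² - k²)/2. The ratio of the remainder to h² + k² should then shrink as the increments shrink along a fixed direction; the direction angle and the sample radii are illustrative.

```python
import math

def f(x, y):
    return math.exp(x) * math.cos(y)

def taylor2(h, k):
    # f(0, 0) + df + d^2 f / 2! at the origin for this particular f
    return 1.0 + h + 0.5 * (h * h - k * k)

def remainder_ratio(rho, alpha=1.0):
    # |R_2| divided by rho^2, with (h, k) = rho * (cos alpha, sin alpha)
    h, k = rho * math.cos(alpha), rho * math.sin(alpha)
    return abs(f(h, k) - taylor2(h, k)) / rho**2

ratios = [remainder_ratio(rho) for rho in (0.1, 0.01, 0.001)]
print(ratios)  # decreases roughly in proportion to rho
```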


Exercises 1.7c 


1. Find the polynomial of second degree that best approximates sin x sin y 
in the neighborhood of the origin. 

2. For f(x, y) = x? + 4y?x, approximate the value of f(2.1, 2.9). 

3. For f(x, y) = x/y + y/x, estimate the error in approximating the value 
of f(.9, .9) by f(1, 1). 

4. Expand the function f(x + h, y + k) in powers of h, k, for 


(a) f(x, y) = x? — 2x?y + y? 


(b) f(x, y) = cos (x + 2y) at x = 0, y 


ola 


(c) f(x, y) = x4y + 2y?x — V3x2. 


5. Expand f(x, y, z) = xyz? in powers of x, y — 1, z + 1. 
6. Obtain the first few terms of the Taylor expansions of the following 
functions in a neighborhood of the origin (0, 0): 


(a) z = arc tan aD D (£) z= log (1 — x) log (1 — y) 
(b) z = cosh x sinh y (g) z= ezv? 
(c) z = cos x cosh (x + y) (h) z= cos (x + y) e722 
(d) z = e? cos y (i) z= cos (x cos y) 
_ sin x a ce 
(e) z = Cos y (j) z= sin (x? + y?) 


7. Estimate the error in replacing cos x/cos y by 


1 T 
—Ż(y2?— nm 
z y2?) for lx|, ly|< 6 


Problems 1.7c 


1. Find the Taylor series for the following functions and indicate their 
range of validity. 


@) 745 
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(b) e%+¥, 
2. Show that the law of cosines in spherical trigonometry,

$$\cos z = \cos x \cos y + \sin x \sin y \cos\theta,$$

reduces to the euclidean law of cosines,

$$z^2 = x^2 + y^2 - 2xy \cos\theta,$$

in the neighborhood of the origin.
3. If f(x, y) is a continuous function with continuous first and second
derivatives, then

$$f_{xx}(0, 0) = \lim_{h \to 0} \frac{f(2h, e^{-1/2h}) - 2f(h, e^{-1/h}) + f(0, 0)}{h^2}.$$

4. Prove that the function $f(x, y) = \exp(-y^2 + 2xy)$ can be expanded in a
series of the form

$$\sum_{n=0}^{\infty} \frac{H_n(x)}{n!} y^n$$

that converges for all values of x and y and that the polynomials $H_n(x)$,
the so-called Hermite polynomials, satisfy

(a) $H_n(x)$ is a polynomial of degree n.
(b) $H_n'(x) = 2n H_{n-1}(x)$
(c) $H_{n+1} - 2x H_n + 2n H_{n-1} = 0$
(d) $H_n'' - 2x H_n' + 2n H_n = 0$.


1.8 Integrals of a Function Depending on a Parameter 


The concept of multiple integral of a function of several variables 
will be taken up in Chapters IV and V. For the moment we shall only 
study the single integrals arising in connection with such functions. 


a. Examples and Definitions 


If f(x, y) is a continuous function of x and y in the rectangular
region $\alpha \le x \le \beta$, $a \le y \le b$, we may think of the quantity x as fixed
and integrate the function f(x, y), considered as a function of y alone,
over the interval $a \le y \le b$. We thus arrive at the expression

$$\int_a^b f(x, y)\, dy,$$

which still depends on the choice of the quantity x. Thus, we are
considering not just one integral but the family of integrals $\int_a^b f(x, y)\, dy$
obtained for different values of x. The quantity x, which is kept fixed
during the integration and to which we can assign any value in its
interval, we call a parameter. Our ordinary integral therefore appears
as a function of the parameter x.

Integrals that are functions of a parameter frequently occur in
analysis and its applications. For example, as the substitution xy = u
readily shows, we have

$$\int_0^1 \frac{x\, dy}{\sqrt{1 - x^2 y^2}} = \arcsin x$$

for -1 < x < 1. Again, in integrating the general power function we
may regard the exponent as a parameter and write accordingly

$$\int_0^1 y^x\, dy = \frac{1}{x + 1},$$

where we assume that x > -1.
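
Both formulas can be confirmed by numerical quadrature, treating x as a parameter that is fixed during each integration. The sketch below uses a plain midpoint rule; the helper `midpoint_integral`, the number of subintervals, and the sample parameter values are illustrative choices, and the power integral is sampled only for x ≥ 0 so that the integrand stays bounded.

```python
import math

def midpoint_integral(g, a, b, n=2000):
    # Midpoint rule for the integral of g over [a, b]
    width = (b - a) / n
    return width * sum(g(a + (i + 0.5) * width) for i in range(n))

def arcsin_integral(x):
    # Integral over 0 <= y <= 1 of x / sqrt(1 - x^2 y^2), for -1 < x < 1
    return midpoint_integral(lambda y: x / math.sqrt(1 - x * x * y * y), 0.0, 1.0)

def power_integral(x):
    # Integral over 0 <= y <= 1 of y^x, sampled here for x >= 0
    return midpoint_integral(lambda y: y ** x, 0.0, 1.0)

for x in (-0.5, 0.0, 0.5):
    print(x, arcsin_integral(x), math.asin(x))

for x in (0.0, 1.0, 2.0):
    print(x, power_integral(x), 1 / (x + 1))
```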

We can represent the region of definition of the function f(x, y) 
geometrically and consider the parallel to the y-axis corresponding to 
the fixed value of the parameter x, as in Fig. 1.15. We obtain the func- 
tion of y that is to be integrated by considering the values of the 
function f(x, y) as a function of y along the line of intersection AB 
of the parallel with the rectangle. We may also speak of integrating 
the function f(x, y) along the segment AB. 


Figure 1.15 


Figure 1.16


This geometrical point of view suggests a generalization. If the
domain of definition R of the function f(x, y) has the shape shown in
Fig. 1.16, such that any parallel to the y-axis cuts the boundary
in at most two points, then for a fixed value of x we can again
integrate the values of the function f(x, y) along the line AB in which
the parallel to the y-axis intersects the region R. The initial and final
points of the interval of integration will themselves vary with x. We
then have to consider an integral of the type


(39) $$\int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\, dy = F(x),$$

that is, an integral with the variable of integration y in which the
parameter x is present both in the integrand and in the limits of
integration. If we represent the function f(x, y) by the surface
z = f(x, y) in x, y, z-space, then for a positive function f we can
consider the cylinder with generators parallel to the z-axis having
as its base the domain R of f in the x, y-plane and bounded on
top by the surface z = f(x, y). A fixed value of x corresponds to a
plane parallel to the y, z-plane, which intersects the solid cylinder in
a certain plane region. The area of that region is given by the integral
in formula (39). For example, the integral

$$\int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} \sqrt{1 - x^2 - y^2}\, dy$$

represents the area of the intersection of the hemisphere

$$0 \le z \le \sqrt{1 - x^2 - y^2}$$

with a plane x = constant.


b. Continuity and Differentiability of an Integral with Respect
to the Parameter


The integral

$$F(x) = \int_a^b f(x, y)\, dy$$

is a continuous function of the parameter x, for $\alpha \le x \le \beta$, if f(x, y)
is continuous in the closed rectangle R given by $\alpha \le x \le \beta$, $a \le y \le b$.
For

$$|F(x + h) - F(x)| = \left| \int_a^b (f(x + h, y) - f(x, y))\, dy \right| \le \int_a^b |f(x + h, y) - f(x, y)|\, dy.$$

In virtue of the uniform continuity of f(x, y), for sufficiently small
values of h the integrand on the right, considered as a function of
y, may be made uniformly as small as we please, and the statement
follows immediately.

We next investigate the possibility of differentiating F(x). We first
consider the case in which the limits of integration are fixed and
assume that the function f(x, y) has a continuous partial derivative
f_x in the closed rectangle R.¹ We shall prove that instead of first
integrating with respect to y and then differentiating with respect to
x we may reverse the order of these two processes:


THEOREM. If in the closed rectangle α ≤ x ≤ β, a ≤ y ≤ b the
function f(x, y) is continuous and has a continuous derivative with
respect to x, we may differentiate the integral with respect to the
parameter under the integral sign, that is,

(40)    F'(x) = \frac{d}{dx} \int_a^b f(x, y)\,dy = \int_a^b f_x(x, y)\,dy.

Moreover, F'(x) is a continuous function of x.
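
Formula (40) lends itself to a quick numerical experiment. The sketch below is only an illustration under our own assumptions: we pick f(x, y) = sin(xy) on 0 ≤ y ≤ 1, approximate all integrals by the midpoint rule, and compare a central difference quotient of F with the integral of f_x. The function names are hypothetical.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x, y):
    return math.sin(x * y)

def f_x(x, y):
    # Partial derivative of sin(xy) with respect to the parameter x.
    return y * math.cos(x * y)

def F(x):
    # F(x) = integral of f(x, y) over 0 <= y <= 1 (fixed limits).
    return integrate(lambda y: f(x, y), 0.0, 1.0)

def derivative_by_difference_quotient(x, h=1e-5):
    return (F(x + h) - F(x - h)) / (2.0 * h)

def derivative_under_integral_sign(x):
    # Right-hand side of formula (40).
    return integrate(lambda y: f_x(x, y), 0.0, 1.0)

x = 1.3
print(derivative_by_difference_quotient(x), derivative_under_integral_sign(x))
```

The two printed numbers should agree to many digits, which is what the theorem predicts for an integrand with continuous f_x.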

Before proving this theorem, we remark that it yields a simple
proof of the fact (already established on p. 37) that in the formation
of the mixed derivative g_xy of a function g(x, y) the order of
differentiation can be changed, provided that g_y and g_xy are continuous and
g_x exists. For if we put f(x, y) = g_y(x, y), we have

    g(x, y) = g(x, a) + \int_a^y f(x, \eta)\,d\eta.

Since f(x, y) has a continuous derivative with respect to x in the
rectangle α ≤ x ≤ β, a ≤ y ≤ b, it follows that

    g_x(x, y) = g_x(x, a) + \int_a^y f_x(x, \eta)\,d\eta,

and therefore by the fundamental theorem of calculus

    g_{yx}(x, y) = f_x(x, y).

Since also f_x(x, y) = g_{xy}(x, y) from the definition of f, we see that
g_yx = g_xy.

¹ This means that f_x exists in the open rectangle and can be extended into the closed
rectangle as a continuous function (see p. 42).

PROOF. If both x and x + h belong to the interval α ≤ x ≤ β,
we can write

    F(x + h) - F(x) = \int_a^b f(x + h, y)\,dy - \int_a^b f(x, y)\,dy
                    = \int_a^b [f(x + h, y) - f(x, y)]\,dy.

Since we have assumed that f(x, y) is differentiable with respect to
x, the mean value theorem of differential calculus in its usual form
gives

    f(x + h, y) - f(x, y) = h\, f_x(x + \theta h, y),    0 < \theta < 1.¹

Moreover, since the derivative f_x is assumed to be continuous in the
closed rectangle and therefore uniformly continuous, the absolute
value of the difference

    f_x(x + \theta h, y) - f_x(x, y)

is less than any positive quantity ε for all h with |h| < δ, where
δ = δ(ε) is independent of x and y. Thus,

    \left| \frac{F(x + h) - F(x)}{h} - \int_a^b f_x(x, y)\,dy \right|
        = \left| \int_a^b f_x(x + \theta h, y)\,dy - \int_a^b f_x(x, y)\,dy \right|
        \le \int_a^b \varepsilon\,dy = \varepsilon (b - a),

for |h| < δ(ε), provided h ≠ 0. This means, however, that the
relation

    \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = \int_a^b f_x(x, y)\,dy = F'(x)

holds. This proves the existence of F'(x) and formula (40). The
continuity of F' follows from that of the integrand f_x(x, y) (see p. 74).

¹ Here the quantity θ depends on y and may even vary discontinuously with y. This
does not matter, for by the equation f_x(x + θh, y) = h^{-1} [f(x + h, y) − f(x, y)] we
see at once that f_x(x + θh, y) is a continuous function of x and y and is therefore
integrable.

In a similar way we can establish the continuity of the integral and 
the rule for differentiating the integral with respect to a parameter 
when the parameter occurs in the limits of integration. 
For example, if we wish to differentiate

    F(x) = \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy,

we start with the expression

    F(x) = \int_u^v f(x, y)\,dy = \phi(u, v, x),

where u = ψ_1(x), v = ψ_2(x). Here we assume that ψ_1(x) and ψ_2(x)
have continuous first derivatives in an interval α ≤ x ≤ β and that

    a < \psi_1(x) < \psi_2(x) < b

for α ≤ x ≤ β. Let, moreover, f(x, y) and f_x(x, y) be continuous in
the set

    \alpha \le x \le \beta,    a \le y \le b.

The function φ of the three independent variables u, v, x is defined
then for

    \alpha \le x \le \beta,    a \le u \le b,    a \le v \le b.

Moreover, it has continuous partial derivatives, since by formula (40)

    \phi_x(u, v, x) = \frac{\partial}{\partial x} \int_u^v f(x, y)\,dy = \int_u^v f_x(x, y)\,dy

and by the fundamental theorem of calculus (Volume I, p. 185)

    \phi_v(u, v, x) = \frac{\partial}{\partial v} \int_u^v f(x, y)\,dy = f(x, v),

    \phi_u(u, v, x) = \frac{\partial}{\partial u} \int_u^v f(x, y)\,dy = -\frac{\partial}{\partial u} \int_v^u f(x, y)\,dy = -f(x, u).

We can apply the chain rule of differentiation (18) p. 55 to the
compound function

    F(x) = \phi[\psi_1(x), \psi_2(x), x]

and find

    F'(x) = \phi_u\, \psi_1'(x) + \phi_v\, \psi_2'(x) + \phi_x.

This proves the existence of a continuous derivative of F(x) for
α < x < β and yields the formula

(41)    \frac{d}{dx} \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy
            = \int_{\psi_1(x)}^{\psi_2(x)} f_x(x, y)\,dy
              - \psi_1'(x)\, f(x, \psi_1(x)) + \psi_2'(x)\, f(x, \psi_2(x)).


Taking, for example, for F(x) the function

    F(x) = \int_0^x \sin(xy)\,dy

we obtain

    F'(x) = \int_0^x y \cos(xy)\,dy + \sin(x^2).

For the example

    F(x) = \int_0^x \frac{dy}{\sqrt{1 - x^2 y^2}} = \frac{1}{x} \arcsin(x^2),

for −1 < x < +1, we obtain the relation

    F'(x) = \int_0^x \frac{x y^2\,dy}{\sqrt{(1 - x^2 y^2)^3}} + \frac{1}{\sqrt{1 - x^4}},

as the reader may verify directly.
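
Rule (41) for variable limits can also be tested numerically. The following sketch makes several assumptions of our own: the helper names are hypothetical, the integrals are approximated by the midpoint rule, and the test case is f(x, y) = sin(xy) with lower limit 0 and upper limit x. It compares the right-hand side of (41) with a central difference quotient of F.

```python
import math

def integrate(g, a, b, n=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def F(f, psi1, psi2, x):
    """Integral of f(x, y) over psi1(x) <= y <= psi2(x)."""
    return integrate(lambda y: f(x, y), psi1(x), psi2(x))

def leibniz_rhs(f, f_x, psi1, psi2, dpsi1, dpsi2, x):
    """Right-hand side of formula (41)."""
    return (integrate(lambda y: f_x(x, y), psi1(x), psi2(x))
            - dpsi1(x) * f(x, psi1(x))
            + dpsi2(x) * f(x, psi2(x)))

def difference_quotient(f, psi1, psi2, x, h=1e-5):
    return (F(f, psi1, psi2, x + h) - F(f, psi1, psi2, x - h)) / (2.0 * h)

# Test case: f = sin(xy), limits 0 and x.
f = lambda x, y: math.sin(x * y)
f_x = lambda x, y: y * math.cos(x * y)
psi1, dpsi1 = (lambda x: 0.0), (lambda x: 0.0)
psi2, dpsi2 = (lambda x: x), (lambda x: 1.0)

x = 0.7
print(difference_quotient(f, psi1, psi2, x),
      leibniz_rhs(f, f_x, psi1, psi2, dpsi1, dpsi2, x))
```

Here the boundary term from the upper limit is sin(x^2), and the lower limit contributes nothing because its derivative vanishes.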
Other examples are given by the sequence of integrals

(42)    F_n(x) = \int_0^x \frac{(x - y)^n}{n!} f(y)\,dy,    F_0(x) = \int_0^x f(y)\,dy,

where n is any positive integer and f(y) is a continuous function of
y alone, in the interval under consideration. Since the expression
arising from differentiation with respect to the upper limit x vanishes,
rule (41) yields the recursion formula

    F_n'(x) = F_{n-1}(x)

for n = 1, 2, 3, . . . . Since F_0'(x) = f(x), this gives at once

(42a)    F_n^{(n+1)}(x) = f(x).

Therefore F_n(x) is that function whose (n + 1)th derivative is equal
to f(x) and which, together with its first n derivatives, vanishes for
x = 0; it arises from F_{n-1}(x) by integration from 0 to x. Hence, F_n(x)
is the function obtained from f(x) by integrating n + 1 times between
the limits 0 and x:

(42b)    F_0(x) = \int_0^x f(y)\,dy,    F_1(x) = \int_0^x F_0(y)\,dy,
         F_2(x) = \int_0^x F_1(y)\,dy,  \ldots,  F_n(x) = \int_0^x F_{n-1}(y)\,dy.

This repeated integration can therefore be replaced by a single
integration of the function \frac{(x - y)^n}{n!} f(y) with respect to y.
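
The equivalence of (42) and (42b) can be illustrated numerically. The sketch below is ours, not the book's: `F_single` evaluates the single integral with the kernel (x − y)^n / n!, `F_repeated` carries out the n + 1 successive integrations from 0, and both use a midpoint rule. For f = cos and n = 2 the common value is x − sin x.

```python
import math

def integrate(g, a, b, n=400):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def F_single(f, n, x):
    """Formula (42): one integration with the kernel (x - y)^n / n!."""
    return integrate(lambda y: (x - y) ** n / math.factorial(n) * f(y), 0.0, x)

def integrate_once(g, steps):
    """Return the function t -> integral of g over [0, t]."""
    return lambda t: integrate(g, 0.0, t, steps)

def F_repeated(f, n, x, steps=60):
    """Formula (42b): integrate f from 0, n + 1 times in succession."""
    g = f
    for _ in range(n + 1):
        g = integrate_once(g, steps)
    return g(x)

x = 1.0
print(F_single(math.cos, 2, x), F_repeated(math.cos, 2, x), x - math.sin(x))
```

The nested version is far more expensive (its cost grows like steps^(n+1)), which is one practical reason for preferring the single integration.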

The rules for differentiating an integral with respect to a parameter 
often remain valid even when differentiation under the integral sign 
yields a function that is not continuous everywhere. In such cases, 
instead of applying general criteria, it is more convenient to verify 
directly whether such a differentiation is permissible in each special 
case. 

As an example, we consider the elliptic integral (cf. Volume I, p.
299)

    F(k) = \int_{-1}^{+1} \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}},    k^2 < 1.

The function

    f(k, x) = \frac{1}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}

is discontinuous at x = +1 and at x = −1, but the integral (as an
improper integral) has a meaning. Formal differentiation with respect
to the parameter k gives

    F'(k) = \int_{-1}^{+1} \frac{k x^2\,dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)^3}}.

To investigate whether this equation is correct, we repeat the
argument by which we obtained our differentiation formula. This
gives

    \frac{F(k + h) - F(k)}{h} = \int_{-1}^{+1} f_k(k + \theta h, x)\,dx
        = \int_{-1}^{+1} \frac{(k + \theta h)\, x^2\,dx}{\sqrt{(1 - x^2)[1 - (k + \theta h)^2 x^2]^3}}.

The difference between this expression and the integral obtained by
formal differentiation is

    \Delta = \int_{-1}^{+1} \frac{x^2}{\sqrt{1 - x^2}}
             \left( \frac{k + \theta h}{\sqrt{[1 - (k + \theta h)^2 x^2]^3}} - \frac{k}{\sqrt{(1 - k^2 x^2)^3}} \right) dx.

We must show that this integral tends to 0 with h. For this purpose
we mark off about k an interval k_0 ≤ k ≤ k_1 not containing the values
±1, and we choose h so small that k + θh lies in this interval. The
function

    \frac{k}{\sqrt{(1 - k^2 x^2)^3}}

is continuous in the closed region −1 ≤ x ≤ 1, k_0 ≤ k ≤ k_1, and is
therefore uniformly continuous. The difference

    \left| \frac{k + \theta h}{\sqrt{[1 - (k + \theta h)^2 x^2]^3}} - \frac{k}{\sqrt{(1 - k^2 x^2)^3}} \right|

consequently remains below a bound ε that is independent of x and
k and which tends to 0 with h. Hence,

    |\Delta| \le \varepsilon \int_{-1}^{+1} \frac{x^2\,dx}{\sqrt{1 - x^2}} = M \varepsilon,

where M is a constant independent of ε. That is, the integral Δ tends
to 0 as h does, which is what we wished to show.

Differentiation under the integral sign is therefore permissible in 
this case. Similar considerations apply in other cases. 
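
A numerical check of the elliptic-integral example is possible if we first remove the endpoint singularities. The sketch below relies on an assumption that is ours, not the text's: we substitute x = sin t, which turns both F(k) and the formally differentiated integral into integrals of smooth functions over −π/2 ≤ t ≤ π/2. The function names are hypothetical.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def F(k):
    # After x = sin t: integrand 1 / sqrt(1 - k^2 sin^2 t).
    return integrate(lambda t: 1.0 / math.sqrt(1.0 - (k * math.sin(t)) ** 2),
                     -math.pi / 2, math.pi / 2)

def F_prime_formal(k):
    # Formal derivative under the integral sign, after the same substitution:
    # integrand k sin^2 t / (1 - k^2 sin^2 t)^(3/2).
    return integrate(
        lambda t: k * math.sin(t) ** 2 / (1.0 - (k * math.sin(t)) ** 2) ** 1.5,
        -math.pi / 2, math.pi / 2)

def F_prime_difference(k, h=1e-5):
    return (F(k + h) - F(k - h)) / (2.0 * h)

k = 0.5
print(F_prime_formal(k), F_prime_difference(k))
```

Agreement of the two numbers is consistent with the conclusion that differentiation under the integral sign is permissible here for k^2 < 1.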

Improper integrals with an infinite range of integration and de- 
pending on a parameter will be discussed on p. 462. 


Exercises 1.8b 
1. Let

    F(k) = \int_a^b \alpha(x)\, \beta(x, k)\,dx,

where β(x, k) and β_k(x, k) are continuous for a ≤ x ≤ b, k_0 < k < k_1,
and α(x) is continuous for a < x < b, and \int_a^b |\alpha(x)|\,dx exists as an
improper integral. Prove that

    F'(k) = \int_a^b \alpha(x)\, \beta_k(x, k)\,dx    for k_0 < k < k_1.

2. Let

    F(k) = \int_0^1 \frac{(x - 1)\, x^k}{\log x}\,dx    for −1 < k.

Prove

(a) \lim_{k \to \infty} k\, F(k) = 1

(b) F(k) = \log \frac{2 + k}{1 + k}

c. Interchange of Integrations. Smoothing of Functions 


The theorem on p. 74 about differentiation under the integral sign 
has the important consequence that we can interchange orders of 
integration. 

Let f(x, y) be continuous in the rectangle R given by

(42c)    a \le x \le b,    \alpha \le y \le \beta.

Then the integrals

(42d)    I = \int_a^b d\xi \int_\alpha^\beta f(\xi, \eta)\,d\eta    and    J = \int_\alpha^\beta d\eta \int_a^b f(\xi, \eta)\,d\xi

have the same value. We call this value the double integral of f over
the rectangle (42c).

As an example we consider the function f(x, y) = y sin (xy) in the
rectangle 0 ≤ x ≤ 1, 0 ≤ y ≤ π/2. Here

    I = \int_0^1 d\xi \int_0^{\pi/2} \eta \sin(\xi\eta)\,d\eta
      = \int_0^1 \left( -\frac{\pi \cos(\pi\xi/2)}{2\xi} + \frac{\sin(\pi\xi/2)}{\xi^2} \right) d\xi
      = \frac{\pi}{2} - 1,

    J = \int_0^{\pi/2} d\eta \int_0^1 \eta \sin(\xi\eta)\,d\xi
      = \int_0^{\pi/2} (1 - \cos\eta)\,d\eta = \frac{\pi}{2} - 1.
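
The worked example can be reproduced numerically. The following sketch is an illustration under our own assumptions (midpoint quadrature, hypothetical function names). It evaluates I and J directly as iterated integrals and also through the closed forms of the inner integrals quoted above, and compares everything with π/2 − 1.

```python
import math

def integrate(g, a, b, n=400):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x, y):
    return y * math.sin(x * y)

def I_direct(n=400):
    # Outer integration over xi in [0, 1], inner over eta in [0, pi/2].
    return integrate(
        lambda xi: integrate(lambda eta: f(xi, eta), 0.0, math.pi / 2, n),
        0.0, 1.0, n)

def J_direct(n=400):
    # Outer integration over eta in [0, pi/2], inner over xi in [0, 1].
    return integrate(
        lambda eta: integrate(lambda xi: f(xi, eta), 0.0, 1.0, n),
        0.0, math.pi / 2, n)

def I_from_inner_formula(n=400):
    # Closed form of the inner integral; the midpoint rule never evaluates xi = 0.
    def inner(xi):
        return (-math.pi * math.cos(math.pi * xi / 2) / (2 * xi)
                + math.sin(math.pi * xi / 2) / xi ** 2)
    return integrate(inner, 0.0, 1.0, n)

def J_from_inner_formula(n=400):
    return integrate(lambda eta: 1.0 - math.cos(eta), 0.0, math.pi / 2, n)

exact = math.pi / 2 - 1
print(I_direct(), J_direct(), I_from_inner_formula(), J_from_inner_formula(), exact)
```

On a product grid the two direct sums contain the same terms in a different order, so their agreement is automatic; the more informative comparison is with the closed-form inner integrals and with π/2 − 1.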


For the general proof of the identity I = J, we introduce the
indefinite integrals

    v(x, y) = \int_\alpha^y f(x, \eta)\,d\eta,    u(x, y) = \int_a^x v(\xi, y)\,d\xi.

Applying formula (40) we find

    u_y(x, y) = \int_a^x v_y(\xi, y)\,d\xi = \int_a^x f(\xi, y)\,d\xi

and thus

    u(x, y) = u(x, \alpha) + \int_\alpha^y u_y(x, \eta)\,d\eta = \int_\alpha^y d\eta \int_a^x f(\xi, \eta)\,d\xi.

For x = b, y = β it follows that I = J.

We have associated here with a continuous function f(x, y) in the
rectangle R a function u(x, y), which has continuous first derivatives

    u_x(x, y) = \int_\alpha^y f(x, \eta)\,d\eta,    u_y(x, y) = \int_a^x f(\xi, y)\,d\xi

and a continuous mixed second derivative

    u_{xy}(x, y) = f(x, y).


We shall use the function for the purpose of “smoothing” f, that is, 
for constructing uniform approximations to f that have continuous 
partial derivatives. 

For technical applications it often is essential to replace a con- 
tinuous function f (itself perhaps only an approximation to an imper- 
fectly known physical quantity) by a smooth function nearby. We 
know from the Weierstrass approximation theorem (Volume I, p. 569) 
that functions of one independent variable, continuous in an interval, 
can be approximated uniformly by polynomials, which even have
derivatives of all orders. The analogous theorem holds for functions
f(x, y) continuous in a rectangle. 

We can construct simpler approximations with a more moderate
degree of smoothness by the process of “averaging” the function
f(x, y). It is convenient here to have extended the definition of f from
its rectangular domain (42c) to the whole x, y-plane so that f is
continuous everywhere.¹ For any h > 0 we form the average of f over the
square of center (x, y) and sides of length 2h parallel to the axes:

(42e)    F_h(x, y) = \frac{1}{4h^2} \int_{x-h}^{x+h} d\xi \int_{y-h}^{y+h} f(\xi, \eta)\,d\eta
                   = \frac{u(x+h, y+h) - u(x+h, y-h) - u(x-h, y+h) + u(x-h, y-h)}{4h^2}.

It is clear that F_h(x, y) has continuous first derivatives and a
continuous mixed second derivative.² In order to see that F_h(x, y)
approximates f(x, y) for small h, we note that

(42f)    F_h(x, y) - f(x, y) = \frac{1}{4h^2} \int_{x-h}^{x+h} d\xi \int_{y-h}^{y+h} [f(\xi, \eta) - f(x, y)]\,d\eta.

Since f is uniformly continuous in some rectangle R' containing R
in its interior, we know that f for given ε and sufficiently small h will
vary by less than ε in every square of side 2h contained in R'. Then
|f(ξ, η) − f(x, y)| < ε in (42f), and |F_h(x, y) − f(x, y)| < ε. Hence

    \lim_{h \to 0} F_h(x, y) = f(x, y)    uniformly for (x, y) in R.

Thus we can find a smooth function F_h(x, y) arbitrarily close to
f(x, y).

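
The averaging process (42e) is easy to try out. The sketch below is a minimal illustration under assumptions of our own: the sample function f(x, y) = |x − 1/2| + |y − 1/2| (continuous in the whole plane, but with kinks), a midpoint rule for the double integral, and hypothetical names `average` and `max_error`. It measures the largest deviation of the average from f on a grid in the unit square for two values of h.

```python
def f(x, y):
    # Continuous everywhere, not differentiable along x = 1/2 and y = 1/2.
    return abs(x - 0.5) + abs(y - 0.5)

def average(g, x, y, h, n=40):
    """F_h(x, y) from (42e): mean value of g over the square with center
    (x, y) and side 2h, approximated by the midpoint rule on an n-by-n grid."""
    step = 2.0 * h / n
    total = 0.0
    for i in range(n):
        xi = x - h + (i + 0.5) * step
        for j in range(n):
            eta = y - h + (j + 0.5) * step
            total += g(xi, eta)
    return total / (n * n)

def max_error(h, m=21):
    """Largest value of |F_h - f| over an m-by-m grid in the unit square."""
    worst = 0.0
    for i in range(m):
        for j in range(m):
            x, y = i / (m - 1), j / (m - 1)
            worst = max(worst, abs(average(f, x, y, h) - f(x, y)))
    return worst

for h in (0.1, 0.01):
    print(h, max_error(h))
```

For this particular f the deviation is largest at the corner of the kinks, (1/2, 1/2), and shrinks in proportion to h, in line with the uniform convergence stated above.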
1.9 Differentials and Line Integrals 


a. Linear Differential Forms 


In Section 1.5d we defined the total differential du of a function
u = f(x, y, z) as the expression

(43)    du = \frac{\partial f(x, y, z)}{\partial x}\,dx + \frac{\partial f(x, y, z)}{\partial y}\,dy + \frac{\partial f(x, y, z)}{\partial z}\,dz.

¹ This can be achieved by continuing f as constant along rays perpendicular to one of
the four sides of the rectangle and by continuing f into the remaining points of the
plane as constant along rays from one of the four corners.

² In order to have F_h(x, y) defined for all points of the rectangle R, we have to have
f defined somewhat beyond R.


This definition for the differential of a function of several variables
is suggested by the chain rule of differentiation. For if x, y, z are given
functions of a variable t,

(44)    x = \phi(t),    y = \psi(t),    z = \chi(t),

then the derivative of the compound function u = f[φ(t), ψ(t), χ(t)]
according to the chain rule (19) is

(45)    \frac{du}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} + \frac{\partial f}{\partial z} \frac{dz}{dt}.

For functions u of a single variable t the differential has been defined
as du = (du/dt) dt. Hence, here by (45)

    du = \left( \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} + \frac{\partial f}{\partial z} \frac{dz}{dt} \right) dt
       = \frac{\partial f}{\partial x} \frac{dx}{dt}\,dt + \frac{\partial f}{\partial y} \frac{dy}{dt}\,dt + \frac{\partial f}{\partial z} \frac{dz}{dt}\,dt,

which formally agrees with (43) if we remember that x, y, z (as
functions of t) have the differentials

    dx = \frac{dx}{dt}\,dt,    dy = \frac{dy}{dt}\,dt,    dz = \frac{dz}{dt}\,dt.

Thus the differential du = df(x, y, z) as given by (43) furnishes
immediately the differential du = (du/dt) dt of u “along any curve”
represented parametrically in the form (44).

The differential du as defined by (43) is a function of the six
variables x, y, z, dx, dy, dz that is linear and homogeneous¹ in the variables
dx, dy, dz, with coefficients that are functions of x, y, z. (There is, of
course, no requirement that the differentials dx, dy, dz have to be
“small” in any sense; such a restriction only arises if we want to use
du as an approximation to the increment

    \Delta u = f(x + dx, y + dy, z + dz) - f(x, y, z)

as explained on p. 42).

¹ The most general linear function of three variables ξ, η, ζ is Aξ + Bη + Cζ +
D with coefficients A, B, C, D not depending on ξ, η, ζ; the linear function is called
“homogeneous” or is said to be a “linear form” when D = 0 (see p. 13).

The most general linear differential form in x,y,z-space is repre- 
sented by the expression 


(46) L = A(x, y, z) dx + B(x, y, z) dy + C(x, y, z) dz. 


It is a function L of the six variables x, y, z, dx, dy, dz that is a linear 
form in the “differential” variables dx, dy, dz, with coefficients de- 
pending on x, y, z. The total differentials du of functions are the 
special linear differential forms L that have coefficients of the form 


(47)    A = \frac{\partial f(x, y, z)}{\partial x},    B = \frac{\partial f(x, y, z)}{\partial y},    C = \frac{\partial f(x, y, z)}{\partial z},


for a suitable function f = f(x, y, z). If a differential form L is the
total differential of a function, we say it is an exact differential form or
is integrable. Not every differential form is integrable; it is necessary
that the coefficients A, B, C of L satisfy certain “integrability
conditions”:

If the coefficients A, B, C of the differential form L are of class C¹
(that is, have continuous first derivatives, see p. 42) and if L is exact,
then the equations

(48)    \frac{\partial B}{\partial z} - \frac{\partial C}{\partial y} = 0,    \frac{\partial C}{\partial x} - \frac{\partial A}{\partial z} = 0,    \frac{\partial A}{\partial y} - \frac{\partial B}{\partial x} = 0

hold.

Equations (48) simply are consequences of the rules for inter- 
changeability of second derivatives. If A, B, C have continuous first 
derivatives and can be written in the form (47), then f has continuous 
second derivatives. Hence, by the theorem on p. 36, the order of dif- 
ferentiation does not matter. Thus, for example, 


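    \frac{\partial A}{\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial B}{\partial x},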


and similarly for the other identities in (48). 
Hence, for example, the linear differential form 


L = y dx + z dy + x dz 


is not integrable, since here

    \frac{\partial A}{\partial y} - \frac{\partial B}{\partial x} = 1 \ne 0.



On the other hand, the integrability conditions (48) are satisfied for 
the differential form 


L = yz dx + zx dy + xy dz, 


which, as a matter of fact, is the total differential du of the function 
u = xyz. To what extent the conditions (48) also are sufficient for 
expressing L as a total differential will be discussed in Section 1.10. 
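
The integrability conditions (48) can be tested numerically for the two forms just discussed. The sketch below is only illustrative: the names `partial` and `integrability_defects` are ours, and the partial derivatives are approximated by central differences rather than computed symbolically.

```python
def partial(g, p, i, h=1e-5):
    """Central-difference approximation of the partial derivative of g
    with respect to coordinate number i (0, 1 or 2) at the point p."""
    q_plus = list(p)
    q_minus = list(p)
    q_plus[i] += h
    q_minus[i] -= h
    return (g(*q_plus) - g(*q_minus)) / (2.0 * h)

def integrability_defects(A, B, C, p):
    """The three left-hand sides of (48): B_z - C_y, C_x - A_z, A_y - B_x."""
    return (partial(B, p, 2) - partial(C, p, 1),
            partial(C, p, 0) - partial(A, p, 2),
            partial(A, p, 1) - partial(B, p, 0))

# L = y dx + z dy + x dz (not integrable)
non_exact = (lambda x, y, z: y, lambda x, y, z: z, lambda x, y, z: x)
# L = yz dx + zx dy + xy dz = d(xyz)
exact = (lambda x, y, z: y * z, lambda x, y, z: z * x, lambda x, y, z: x * y)

p = (0.3, -1.2, 2.0)
print(integrability_defects(*non_exact, p))
print(integrability_defects(*exact, p))
```

For the first form all three expressions come out close to 1; for the second they vanish up to rounding error, as they must for a total differential.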

Similar conditions for integrability are obtained when the num- 
ber of dimensions is other than three. For two independent variables 
x, y the general linear differential form is L = A(x, y)dx + B(x, y)dy. 
If L is the differential du of a function u = f(x, y) the coefficients 
A, B satisfy the equation 


    \frac{\partial A}{\partial y} = \frac{\partial B}{\partial x}.


In four dimensions, on the other hand, we obtain corresponding to 
equations (48) six integrability conditions by forming all possible 
mixed second derivatives of a function f of four variables. 

The reason why it makes sense to consider a differential form L 
even when it is not an exact differential is that, along any curve C 
given parametrically in the form 


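    x = \phi(t),    y = \psi(t),    z = \chi(t),

L becomes the differential

    L = \left( A \frac{dx}{dt} + B \frac{dy}{dt} + C \frac{dz}{dt} \right) dt

of a function of a single variable. This function is simply the one
given by the indefinite integral

    \int L = \int \left( A \frac{dx}{dt} + B \frac{dy}{dt} + C \frac{dz}{dt} \right) dt.
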
b. Line Integrals of Linear Differential Forms 


For the purpose of discussing integration of linear differential
forms over lines, it is important to have a clear picture of the
concepts and properties of oriented arcs and closed curves. The reader is
advised to reread Volume I, pp. 333-340, where all the relevant
remarks are made for the case of plane curves. These apply equally well
to curves in spaces of any number of dimensions.¹ Without restriction
of generality we shall talk about integrals over curves in
three-dimensional x, y, z-space.

A simple arc Γ is a set of points P = (x, y, z) that can be
represented parametrically in the form

(49)    x = \phi(t),    y = \psi(t),    z = \chi(t);    a \le t \le b,

where φ, ψ, χ are continuous functions of t for a ≤ t ≤ b, and
different t in that interval correspond to different points P. The
parametric representation (49) constitutes a 1-1 continuous mapping of the
interval on the t-axis onto the set Γ in space.² The same simple arc
Γ has many different parametric representations. The most general
one is obtained from the particular representation (49) by taking any
continuous monotone function μ(τ), mapping the interval α ≤ τ ≤ β
onto the interval a ≤ t ≤ b, and setting

(50)    x = \phi[\mu(\tau)],    y = \psi[\mu(\tau)],    z = \chi[\mu(\tau)];    \alpha \le \tau \le \beta.

There are two ways of ordering the points of Γ, which in any
particular parametric representation (49) correspond to ordering
according to either increasing or decreasing t. The choice of one of
these two orderings converts Γ into an oriented simple arc Γ*. We
say that Γ* is oriented positively with respect to the parameter t if
the orientation of Γ* corresponds to increasing t and negatively if
it corresponds to decreasing t. The oriented simple arc with the
opposite orientation is denoted by −Γ*. The orientation is fixed
completely if we know the order of any two points P_0, P_1 on Γ. If
Γ* is oriented positively with respect to the parameter t and if t_0 and
t_1 are the parameter values for P_0, P_1, then t_0 < t_1 means that P_1
follows P_0 or P_0 precedes P_1 on Γ* (Fig. 1.17).

¹ Specifically two-dimensional are only the notions of “positive and negative side”
of a curve and of “clockwise and counterclockwise sense.”

² The continuity of the mapping from t onto P is obvious from the assumed continuity
of the functions φ, ψ, χ. It is important to realize that the inverse mapping P → t
also is continuous. This means that given a sequence of points P_n on Γ converging
to a point P the corresponding parameter values t_n converge to the parameter value
for P. For the proof we observe that by the compactness property of closed and
bounded intervals (Volume I, p. 95) a subsequence of the t_n converges to some value t with
a ≤ t ≤ b. By the continuity of the original mapping, t is mapped on the limit P of
the P_n. Because of the assumed 1-1 character of the mapping, t is determined
uniquely by P. Hence, every convergent subsequence of the t_n has as limit the parameter
value t corresponding to P. This proves, however, that the whole sequence of the t_n
converges to t.

Figure 1.17 Simple arc in space oriented negatively with respect to parameter τ,
positively with respect to parameter t = μ(τ), where μ(α) = b, μ(β) = a.


The end points of the oriented simple arc Γ* correspond in the
parametric representation (49) to the values t = a, b in some order. We
distinguish them respectively as “initial” and “final” point of Γ*,
the initial end point being the one that precedes the other one. If Γ*
has the initial point A and final point B we write

    Γ* = AB.

The oppositely oriented arc is then

    −Γ* = BA.

If Γ* is oriented positively with respect to t, the initial point has
parameter value a, and the final point, parameter value b.

An oriented simple arc Γ* = AB can be divided into oriented simple
subarcs Γ_1*, . . . , Γ_n* by points P_1, . . . , P_{n-1} on Γ* following
each other according to the orientation. We put P_0 = A, P_n = B and
define for i = 1, . . . , n the arc Γ_i* as the set of points on Γ*
consisting of P_{i-1}, P_i and all points preceding P_i and following P_{i-1}, ordered
in the same way as on Γ*. We write symbolically

(51)    \Gamma^* = \Gamma_1^* + \Gamma_2^* + \cdots + \Gamma_n^*.

If Γ* is oriented positively with respect to the parameter t in the
representation (49) and if t_i is the parameter value corresponding to
P_i, we have

    a = t_0 < t_1 < t_2 < \cdots < t_n = b.

The arc Γ_i* is obtained when we restrict t to the interval t_{i-1} ≤ t ≤
t_i (Fig. 1.18).

Figure 1.18 Oriented arc Γ* = AB represented as sum of
arcs Γ_{i+1}* = P_i P_{i+1} such that Γ* = Γ_1* + Γ_2* + Γ_3* + Γ_4* + Γ_5*.


We are able now to define the integral \int_{\Gamma^*} L of the linear differential
form

(52)    L = A(x, y, z) dx + B(x, y, z) dy + C(x, y, z) dz

over a simple oriented arc Γ*. We assume that the coefficients A,
B, C of L are continuous in a neighborhood of Γ*. We make the
further assumption that the arc Γ* not only is continuous but
sectionally smooth, that is, that it can be represented parametrically
by functions

(53)    x = \phi(t),    y = \psi(t),    z = \chi(t);    a \le t \le b,

which are sectionally smooth.¹

¹ This means that φ, ψ, χ are continuous for a ≤ t ≤ b and have continuous first
derivatives in that interval except possibly for a finite number of
jump-discontinuities of the derivatives. Notice that we require only the existence of some sectionally
smooth parametric representation of Γ*, while other representations need not be
smooth.

Let P_0, P_1, . . . , P_n be any n + 1 points of Γ* following each
other in the order determined by the orientation of Γ*, where P_0 is
the initial, and P_n the final, point of Γ*.

We form the Riemann sum

(54)    F_n = \sum_{\nu=0}^{n-1} (A_\nu\, \Delta x_\nu + B_\nu\, \Delta y_\nu + C_\nu\, \Delta z_\nu).

Here A_ν, B_ν, C_ν are the values of A, B, C at some point Q_ν that
precedes P_{ν+1} and follows P_ν on Γ*, and Δx_ν, Δy_ν, Δz_ν stand for

    x(P_{\nu+1}) - x(P_\nu),    y(P_{\nu+1}) - y(P_\nu),    z(P_{\nu+1}) - z(P_\nu).

We shall show that for n → ∞ the sequence of F_n converges to a limit
F, provided that the largest distance between successive points P_ν,
P_{ν+1} tends to 0. The value of F does not depend on the particular
choice of the points P_ν or of the intermediate points Q_ν. We call F the
integral of the form L over the oriented arc Γ*, and write

(55)    F = \int_{\Gamma^*} L = \int_{\Gamma^*} A\,dx + B\,dy + C\,dz.

Since the definition of the integral does not refer to parametric
representations, it is clear that the integral does not depend on the
choice of parameters. The existence proof will imply that the integral
is represented by the ordinary Riemann integral

(56)    \int_{\Gamma^*} L = \varepsilon \int_a^b \left( A \frac{dx}{dt} + B \frac{dy}{dt} + C \frac{dz}{dt} \right) dt.

Here the integrand is the function of the single variable t obtained
by substituting for the arguments x, y, z of A, B, C their expressions
(53); moreover, ε = +1 when Γ* is oriented positively with respect
to t and ε = −1 when oriented negatively. Without distinguishing
cases we can also write (56) as

(57)    \int_{\Gamma^*} L = \int_{t_i}^{t_f} \left( A \frac{dx}{dt} + B \frac{dy}{dt} + C \frac{dz}{dt} \right) dt,

where t_i is the parameter value for the initial point and t_f that of the
final point of the oriented arc Γ*; that is, t_i = a, t_f = b when ε =
+1, and t_i = b, t_f = a when ε = −1.
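
The definition by Riemann sums and the parametric formula (57) can be compared numerically. The sketch below is ours and rests on several choices not made in the text: the form L = y dx + z dy + x dz, the helix x = cos t, y = sin t, z = t with 0 ≤ t ≤ 2π, equally spaced parameter values for the points P_ν, and the intermediate points Q_ν = P_ν. For this curve a direct computation from (57) gives the value −π.

```python
import math

def curve(t):
    return (math.cos(t), math.sin(t), t)

def velocity(t):
    return (-math.sin(t), math.cos(t), 1.0)

def coeffs(x, y, z):
    # Coefficients (A, B, C) of L = y dx + z dy + x dz.
    return (y, z, x)

def riemann_sum(t_start, t_end, n):
    """Sum (54) with points P_nu at equally spaced parameter values running
    from t_start to t_end, and with Q_nu = P_nu."""
    total = 0.0
    for nu in range(n):
        t0 = t_start + (t_end - t_start) * nu / n
        t1 = t_start + (t_end - t_start) * (nu + 1) / n
        p0, p1 = curve(t0), curve(t1)
        A, B, C = coeffs(*p0)
        total += A * (p1[0] - p0[0]) + B * (p1[1] - p0[1]) + C * (p1[2] - p0[2])
    return total

def parametric_integral(t_i, t_f, n):
    """Formula (57), evaluated by the midpoint rule in t from t_i to t_f."""
    h = (t_f - t_i) / n
    total = 0.0
    for k in range(n):
        t = t_i + (k + 0.5) * h
        A, B, C = coeffs(*curve(t))
        dx, dy, dz = velocity(t)
        total += (A * dx + B * dy + C * dz) * h
    return total

print(riemann_sum(0.0, 2 * math.pi, 20000))       # orientation of increasing t
print(parametric_integral(0.0, 2 * math.pi, 2000))
print(riemann_sum(2 * math.pi, 0.0, 20000))       # opposite orientation
```

Reversing the orientation (running the parameter from 2π down to 0) changes the sign of the result, which corresponds to the factor ε = −1 in (56).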

To prove convergence of the Riemann sums F_n, we make use of the
sectionally smooth parametric representation (53) of Γ*. Let t_ν be the
parameter value corresponding to the point P_ν. Since the
correspondence between parameter values and points on the curve is
continuous both ways for simple arcs (see footnote on p. 86), we see
that as the largest distance between successive points tends to 0,
the largest value of |t_{ν+1} − t_ν| tends to 0 for n → ∞. The functions
φ'(t), ψ'(t), χ'(t) may have jump-discontinuities at a finite number
of points. We can assume that all those points of discontinuity occur
among our subdivision points t_0, t_1, . . . , t_n, for since the A, B, C
are bounded and the largest of the Δx_ν, Δy_ν, Δz_ν tend to 0 for n → ∞,
the effects of adding or subtracting contributions from a fixed finite
number of subdivision points in the Riemann sum F_n disappear in the
limit.

Since φ(t), ψ(t), χ(t) are now differentiable in the interior of
each subinterval, we can apply the mean value theorem of differential
calculus (see Volume I, p. 174) and find

    \Delta x_\nu = \phi(t_{\nu+1}) - \phi(t_\nu) = \phi'(\tau_\nu)(t_{\nu+1} - t_\nu),
    \Delta y_\nu = \psi'(\tau_\nu')(t_{\nu+1} - t_\nu),    \Delta z_\nu = \chi'(\tau_\nu'')(t_{\nu+1} - t_\nu),

with values τ_ν, τ_ν', τ_ν'' intermediate between t_ν and t_{ν+1}. The point
Q_ν on Γ* also corresponds to a parameter value σ_ν intermediate
between t_ν and t_{ν+1}. Hence, the Riemann sum F_n in (54) takes the form

    F_n = \sum_{\nu=0}^{n-1} [A(\sigma_\nu)\, \phi'(\tau_\nu) + B(\sigma_\nu)\, \psi'(\tau_\nu') + C(\sigma_\nu)\, \chi'(\tau_\nu'')] (t_{\nu+1} - t_\nu).

Here the points t_0, t_1, . . . , t_n form a subdivision of the parameter
interval [a, b]. If Γ* is oriented positively with respect to t, the t_ν
form an increasing sequence with t_0 = a, t_n = b, and Δt_ν = t_{ν+1} − t_ν
> 0. Otherwise, the t_ν are decreasing, t_0 = b, t_n = a, and Δt_ν < 0.
In our notation for the parameter interval, a always stands for the
smaller one of the values a, b and thus may correspond to either the
initial or the final point of the arc Γ*.

If we now use the fundamental existence theorem for definite
integrals as limits of Riemann sums (see Volume I, pp. 192 ff.), we find that
F = \lim_{n \to \infty} F_n exists and is given by formula (56).¹ The factor ε = ±1
arises from the assumption made in that theorem that the points of
subdivision t_ν used in forming the Riemann sum constitute an
increasing sequence. When the orientation of Γ* corresponds to
decreasing t, we have to run through the values t_ν in opposite order,
starting with t_n and ending with t_0, and change the sign of Δt_ν.

¹ The intermediate values τ_ν, τ_ν', τ_ν'', σ_ν need not be the same for convergence (see
the remarks on p. 195, Volume I).

It is clear that the definition of line integral and the formula (56)
can be extended to the case where Γ* is an oriented simple closed
curve.¹ In this case we form the Riemann sum by selecting n points
P_1, P_2, . . . , P_n on Γ* that follow each other in the order determined
by the orientation, and we put P_0 = P_n in the expression (54) for F_n.

Instances of integrals over curves in the x, y-plane have been
encountered already in Volume I. Thus, the oriented area bounded by
a closed oriented curve Γ* had been represented in the form

    A = \frac{1}{2} \int_a^b \left( x \frac{dy}{dt} - y \frac{dx}{dt} \right) dt

(see Volume I, p. 365); that is, as the line integral

    A = \frac{1}{2} \int_{\Gamma^*} x\,dy - y\,dx.

Another example is furnished by the work W done by a field of force
with components ρ, σ in moving from a point P_0 to a point P_1 along
a curve Γ* = P_0P_1 referred to arc length s as parameter. Here (see
Volume I, p. 420)

    W = \int \left( \rho \frac{dx}{ds} + \sigma \frac{dy}{ds} \right) ds,

which can be written as

    W = \int_{\Gamma^*} \rho\,dx + \sigma\,dy.

In the same way we can define the work done by forces in space with
components ρ, σ, τ in moving along an arc Γ* in the direction
given by its orientation as a line integral

    W = \int_{\Gamma^*} \rho\,dx + \sigma\,dy + \tau\,dz.

¹ Such a curve has a continuous parametric representation (53), with different t
corresponding to different points, except that t = a and t = b yield the same point.
Moreover a cyclic order is specified on Γ*, corresponding to either increasing or
decreasing t (see Volume I, p. 339). We can always represent Γ* as sum of oriented
simple arcs Γ_i* in the form (51), where for i = 2, . . . , n the final point of Γ_{i-1}* is
the initial point of Γ_i* and where the final point of Γ_n* is the initial point of Γ_1*.
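
The area formula can be illustrated with the Riemann sum (54) for a closed curve. The following sketch is our own illustration: we take an ellipse with semiaxes 3 and 2, choose n points on it in counterclockwise or clockwise order, set P_0 = P_n and Q_ν = P_ν, and sum (1/2)(x Δy − y Δx). The helper names are hypothetical.

```python
import math

def oriented_area(points):
    """Riemann sum (54) for (1/2)(x dy - y dx) over the closed path through
    `points`, taken in the given order, with P_0 = P_n and Q_nu = P_nu."""
    total = 0.0
    n = len(points)
    for nu in range(n):
        x0, y0 = points[nu]
        x1, y1 = points[(nu + 1) % n]
        total += 0.5 * (x0 * (y1 - y0) - y0 * (x1 - x0))
    return total

def ellipse_points(a, b, n, clockwise=False):
    """n points on the ellipse x = a cos t, y = b sin t in cyclic order."""
    sign = -1.0 if clockwise else 1.0
    return [(a * math.cos(sign * 2 * math.pi * k / n),
             b * math.sin(sign * 2 * math.pi * k / n)) for k in range(n)]

print(oriented_area(ellipse_points(3.0, 2.0, 10000)))                  # close to 6*pi
print(oriented_area(ellipse_points(3.0, 2.0, 10000, clockwise=True)))  # close to -6*pi
```

The sign of the result records the orientation of the curve, while its absolute value approximates the area π · 3 · 2 enclosed by the ellipse.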

Exercises 1.9b

1. Find

    \int z\,dx + x\,dy + y\,dz

(a) over the arc of the helix

    x = \cos t,    y = \sin t,    z = t

joining the points (1, 0, 0) and (1, 0, 2π);

(b) over the parabolic arc

    x = x_0(1 - t^2),    y = y_0(1 - t^2),    z = t

joining the points (0, 0, 1) and (0, 0, −1) (for constant x_0, y_0).


c. Dependence of Line Integrals on End Points 


We return to the general differential form L given by (52). Let Γ be
a simple arc (not yet oriented) with a sectionally smooth parameter
representation (53).

For any two points P_0, P_1 on Γ corresponding to the values t_0, t_1
of the parameter t, we can form the integral

    I = \int_{t_0}^{t_1} \left( A \frac{dx}{dt} + B \frac{dy}{dt} + C \frac{dz}{dt} \right) dt.

By formula (57), I is equal to \int L extended over the oriented subarc
P_0P_1 of Γ that has P_0 as initial and P_1 as final point. It follows that
I does not depend on the particular parameter representation. We
write

    I = \int_{P_0}^{P_1} L.

The value of I is determined by the ordered pair of points P_0, P_1 and
the simple arc of which they are end points.

For fixed P_0 we can define a function f = f(P) along the arc Γ by
the indefinite integral

(58)    f(P) = \int_{P_0}^{P} L = \int_{t_0}^{t} \left( A \frac{dx}{dt} + B \frac{dy}{dt} + C \frac{dz}{dt} \right) dt.

Taking f as a function of the independent variable t, we then have

(59)    \frac{df}{dt} = A \frac{dx}{dt} + B \frac{dy}{dt} + C \frac{dz}{dt}.

Writing this equation as

    df = \frac{df}{dt}\,dt = A\,dx + B\,dy + C\,dz = L,

we thus express the linear differential form L (which need not be
exact) as the differential of a function f; but we have to remember
that this relation holds only along a special curve Γ on which f is
defined.

For any points P and P' of Γ

(60)    \int_P^{P'} L = f(P') - f(P).

This follows immediately if we express the line integrals as integrals
over the variable t and apply the fundamental connection between
definite and indefinite integrals (see Volume I, p. 190). If Γ*, the arc
Γ with a certain orientation, has the initial point A and the final
point B, we find, in particular, that

(61)    \int_{\Gamma^*} L = \int_A^B L = f(B) - f(A).

If P_0, . . . , P_n are points on Γ* in the order determined by the
orientation of Γ*, with P_0 = A, P_n = B, we have

    \int_{\Gamma^*} L = f(B) - f(A) = \sum_{\nu=0}^{n-1} [f(P_{\nu+1}) - f(P_\nu)]
                      = \sum_{\nu=0}^{n-1} \int_{P_\nu}^{P_{\nu+1}} L.

If we denote by Γ_{ν+1}* the subarc with initial point P_ν and final point
P_{ν+1}, we have

    \int_{P_\nu}^{P_{\nu+1}} L = \int_{\Gamma_{\nu+1}^*} L.

Here the orientation of Γ_ν* agrees with that of Γ* so that

    \Gamma^* = \Gamma_1^* + \Gamma_2^* + \cdots + \Gamma_n^*.

Therefore, line integrals are additive:

(62)    \int_{\Gamma^*} L = \int_{\Gamma_1^*} L + \cdots + \int_{\Gamma_n^*} L.

Similarly, if we interchange the end points of Γ*,

(63)    \int_{-\Gamma^*} L = -\int_{\Gamma^*} L.


These rules are of particular interest when applied to oriented
closed curves represented as sums of oriented simple arcs. Consider a
number of oriented simple closed curves C_1*, . . . , C_n* (see Fig. 1.19),
which may have portions in common. Assume that a simple arc
Γ common to two of the curves, C_i* and C_k*, receives opposite
orientations from C_i* and C_k* and that the portions of the curves not
common to any two of them add up to an oriented closed curve C*. Writing
each line integral over a curve C_i* as the sum of integrals over simple
arcs and adding all these integrals, the contributions of the common
arcs cancel out and we are left with the formula

(64)    \int_{C^*} L = \int_{C_1^*} L + \int_{C_2^*} L + \cdots + \int_{C_n^*} L.

Figure 1.19 Additivity of line integrals over closed curves.


This situation arises, in particular, when the C_i* are plane curves
forming the boundaries of nonoverlapping two-dimensional regions
R_i that together form a region R with boundary curve C*, all C_i* and
C* having the same orientation. More generally, the region R and
its boundary C* may lie on a surface, and R may be subdivided by arcs
into subregions R_i with boundary curves C_i* whose orientations fit
together in the manner described.

A somewhat different application of the same principle occurs in
the following theorem. Let two oriented closed curves C* and C′*
(see Fig. 1.20) be subdivided by the points A_1, ..., A_n and A_1′, ...,
A_n′, respectively, in the order of the sense of orientation, and let each
pair of corresponding points A_i and A_i′ be joined by a curved line. If
by C_i* we denote the closed oriented curve A_i A_{i+1} A_{i+1}′ A_i′ (identifying
A_{n+1} with A_1 and A_{n+1}′ with A_1′), then

$$\int_{C^*} L - \int_{C'^*} L = \sum_{i=1}^{n} \int_{C_i^*} L. \tag{65}$$

Figure 1.20


1.10 The Fundamental Theorem on Integrability of Linear 
Differential Forms 


a. Integration of Total Differentials 


A particularly important class of differential forms

$$L = A\,dx + B\,dy + C\,dz \tag{66}$$

is formed by the total differentials of functions u = f(x, y, z), with A, B, C of
the form

$$A = \frac{\partial f}{\partial x}, \quad B = \frac{\partial f}{\partial y}, \quad C = \frac{\partial f}{\partial z}, \tag{67}$$

where f is a function with continuous first derivatives. While in
general the value of $\int_{\Gamma^*} L$ depends not only on the end points but
on the entire course of the curve, the following theorem is valid
here:

The integral of a linear differential form L that is the total differential
of a function f is equal to the difference of the values of f at
the end points and does not depend on the course of Γ* between those
points. That is, we obtain the same value for $\int_{\Gamma^*} L$ for all curves Γ*
which lie in the domain of f and have the same initial point P_0 and
the same final point P_1.

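For the proof, let the curve Γ* be referred to a parameter t, where
t_0 corresponds to the initial point P_0 and t_1 to the final point P_1. By
(57), p. 89,

$$\int_{\Gamma^*} L = \int_{t_0}^{t_1} \left( A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt} \right) dt.$$

By the chain rule of differentiation [see formula (18), p. 55] we then
have

$$\int_{\Gamma^*} L = \int_{t_0}^{t_1} \frac{df}{dt}\,dt = f\,\Big|_{t_0}^{t_1} = f(P_1) - f(P_0), \tag{68}$$

where we write

$$f(P_i) = f(x(t_i), y(t_i), z(t_i))$$

for i = 0, 1.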

We observe that instead of requiring that the integral be independent
of the path, we might just as well require that the integral
over a simple closed curve Γ* have the value 0, for if we divide the
curve Γ* by means of two points P_0 and P_1 into two oriented arcs
Γ_1* and Γ_2*, we have

$$\Gamma^* = \Gamma_1^* + \Gamma_2^*,$$

where, say, Γ_1* has initial point P_0 and final point P_1, while Γ_2* has
initial point P_1 and final point P_0 (see p. 94). Then

$$\int_{\Gamma^*} L = \int_{\Gamma_1^*} L + \int_{\Gamma_2^*} L = \int_{\Gamma_1^*} L - \int_{-\Gamma_2^*} L.$$

Here −Γ_2* has the same initial point P_0 and the same final point
P_1 as Γ_1*. The vanishing of $\int L$ over the closed curve Γ* means exactly
the same thing as the equality of the integrals of L taken over the two
simple arcs that have P_0 as initial point and P_1 as final point.
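The theorem can be checked numerically. The following is a minimal sketch, not part of the original text: the function f, the two paths, and the helper `line_integral` are illustrative choices, and the integral is approximated by a midpoint rule on short chords.

```python
import math

def f(x, y, z):
    # Hypothetical sample function with continuous first derivatives.
    return x * y + math.sin(z)

def grad_f(x, y, z):
    # Coefficients (A, B, C) = (f_x, f_y, f_z) of the total differential df.
    return (y, x, math.cos(z))

def line_integral(coeffs, path, n=20000):
    """Approximate the integral of A dx + B dy + C dz over path(t), 0 <= t <= 1."""
    total = 0.0
    for k in range(n):
        p0 = path(k / n)
        p1 = path((k + 1) / n)
        mid = tuple(0.5 * (a + b) for a, b in zip(p0, p1))
        A, B, C = coeffs(*mid)
        total += A * (p1[0] - p0[0]) + B * (p1[1] - p0[1]) + C * (p1[2] - p0[2])
    return total

def straight(t):
    # Straight segment from (0, 0, 0) to (1, 2, 3).
    return (t, 2 * t, 3 * t)

def curved(t):
    # A different curve with the same end points.
    return (t ** 2, 2 * t ** 3, 3 * math.sin(math.pi * t / 2))

expected = f(1, 2, 3) - f(0, 0, 0)   # f(P_1) - f(P_0), as in (68)
i1 = line_integral(grad_f, straight)
i2 = line_integral(grad_f, curved)
print(expected, i1, i2)
```

Both approximations should agree with f(P_1) − f(P_0) up to the discretization error, although the two curves are quite different.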


b. Necessary Conditions for Line Integrals to Depend Only on 
the End Points 


Only under very special conditions is a line integral independent
of the path or, what is equivalent, is the line integral around a closed
path equal to 0. For example, if a closed curve C* in the x,y-plane forms the
boundary of a region of positive area, then the line integral
$\int (x\,dy - y\,dx)$ over C* is not 0. We proved in the preceding section
that for the independence of $\int L$ from the path joining the end points, it
is sufficient that L is a total differential. The chief task of the theory of
line integrals is to show that this condition is also necessary and then
to express this necessary and sufficient condition in a form convenient
for applications.

We shall investigate this question of independence for integrals
over curves in three-space, but the results and proofs are exactly
analogous in any number of dimensions. We make the assumption that
L = A dx + B dy + C dz is a linear differential form with coefficients
A, B, C that are continuous functions of x, y, z in an open set R of
space. The following theorem then holds:


The line integral $\int L$ taken over a simple oriented arc Γ* in R is
independent of the particular choice of Γ* and determined solely by
the initial and final point of Γ* if and only if L is the total differential
of a function f(x, y, z) in R.

We have already proved on p. 95 that this condition is sufficient;
that is, for an exact differential L = A dx + B dy + C dz the integral
$\int L$ is independent of the path. It is easy to see that the condition is
necessary. Assume that $\int_{\Gamma^*} L$ depends only on the end points of Γ*.
We want to show that there exists a function u(x, y, z) defined in R
for which du = L. With no loss of generality we can assume that
every two points of R can be connected by a simple polygonal arc
that lies completely in R.¹ We pick a fixed point P_0 in R and define
the function u = u(x, y, z) = u(P) at any point P of R as $\int L$ extended
over any simple arc with initial point P_0 and final point P. In order
to compute the partial derivatives of u, we consider any point (x, y, z)
= P of R (Fig. 1.21). Since R is open, all points (x + h, y, z) = P′
will then also belong to R provided |h| is sufficiently small. Let γ*
denote the oriented straight line segment joining P and P′, while Γ*
shall denote a simple polygonal path joining P_0 to P. We can always
modify Γ* slightly to bring about that the last side of this polygonal
arc, which has P as final point, is not parallel to the x-axis. Then Γ*
and γ* have no point in common besides P (at least for |h| sufficiently
small), and Γ* + γ* represents a simple arc with initial point P_0 and
final point P′. It follows [see (62), p. 93] that

$$u(x+h, y, z) - u(x, y, z) = u(P') - u(P) = \int_{\Gamma^* + \gamma^*} L - \int_{\Gamma^*} L = \int_{\gamma^*} L = \int_x^{x+h} A(t, y, z)\,dt.$$

Dividing by h and passing to the limit with h → 0, we find, since A is
continuous, that indeed

$$\frac{\partial u(x, y, z)}{\partial x} = A,$$

and similarly ∂u/∂y = B and ∂u/∂z = C. This shows that du = L.

¹The open set R can always be decomposed into connected subsets that have this
property (see Appendix, p. 112). We then define u in each of these subsets by the
construction indicated.

Figure 1.21
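The construction of u can be imitated numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: the coefficients are those of the exact form d(x²y + yz), the polygonal path from P_0 runs parallel to the axes, and the helper names `segment_integral` and `u` are hypothetical.

```python
def coeffs(x, y, z):
    # Coefficients A, B, C of the exact form L = d(x^2 y + y z) (sample choice).
    return (2 * x * y, x * x + z, y)

def segment_integral(p, q, n=2000):
    """Midpoint-rule approximation of the integral of L over the segment from p to q."""
    total = 0.0
    for k in range(n):
        s = (k + 0.5) / n
        m = tuple(a + s * (b - a) for a, b in zip(p, q))
        A, B, C = coeffs(*m)
        total += (A * (q[0] - p[0]) + B * (q[1] - p[1]) + C * (q[2] - p[2])) / n
    return total

def u(x, y, z, p0=(0.0, 0.0, 0.0)):
    # u(P) = integral of L over a polygonal arc from P_0 to P = (x, y, z);
    # here the arc has three sides parallel to the coordinate axes.
    corners = [p0, (x, p0[1], p0[2]), (x, y, p0[2]), (x, y, z)]
    return sum(segment_integral(a, b) for a, b in zip(corners, corners[1:]))

x, y, z, h = 0.5, -1.2, 0.8, 1e-4
# Difference quotient of u in the x-direction, to be compared with A(x, y, z).
du_dx = (u(x + h, y, z) - u(x - h, y, z)) / (2 * h)
A_value = coeffs(x, y, z)[0]
print(du_dx, A_value)
```

The difference quotient of u should reproduce the coefficient A at the chosen point, in line with ∂u/∂x = A.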


c. Insufficiency of the Integrability Conditions 


The theorem on independence of line integrals we just proved is,
however, of no great value unless we have some way of finding out
whether a given differential L is a total differential or not. It is
desirable to have some condition that involves only the coefficients
A, B, C of L = A dx + B dy + C dz and is easily verified. We have
already recognized the integrability conditions

$$\frac{\partial B}{\partial z} - \frac{\partial C}{\partial y} = 0, \quad \frac{\partial C}{\partial x} - \frac{\partial A}{\partial z} = 0, \quad \frac{\partial A}{\partial y} - \frac{\partial B}{\partial x} = 0 \tag{69}$$

as necessary for the existence of a function u = f(x, y, z) with the
property that L = du. A form L satisfying (69) is called closed. Hence
every exact form is closed. Since line integrals can be independent of
the particular path joining any two points only when L is a total
differential, we see that conditions (69) are necessary if $\int L$ is to depend
only on the end points of the path of integration. Are these conditions
also sufficient? They are sufficient if they permit us to construct a
function u = f(x, y, z) for which

$$A = \frac{\partial f}{\partial x}, \quad B = \frac{\partial f}{\partial y}, \quad C = \frac{\partial f}{\partial z}. \tag{70}$$

The surprising result is that the integrability conditions (69) suffice
almost, but not quite, to ensure that L is the total differential of a
function u and, hence, to ensure the independence of $\int L$ from the
path. The identities (69) in themselves are not sufficient but become
so if we add an assumption of quite a different character, one that
concerns a geometrical property of the region in space in which L is
considered.

A simple counterexample shows that conditions (69) alone are not
sufficient to guarantee that $\int L$ taken over any closed curve is 0. We
consider the differential

$$L = \frac{x\,dy - y\,dx}{x^2 + y^2} \tag{71}$$

corresponding to the choice of coefficients

$$A = \frac{-y}{x^2 + y^2}, \quad B = \frac{x}{x^2 + y^2}, \quad C = 0,$$

which are defined except for points on the line x = y = 0 (the z-axis).
One verifies easily that the integrability conditions (69) are satisfied
and thus that L is closed. When we integrate around the unit circle
C*: x = cos t, y = sin t, z = 0 in the x,y-plane, oriented positively
with respect to t, we find

$$\int_{C^*} L = \int_0^{2\pi} \left( A\frac{dx}{dt} + B\frac{dy}{dt} \right) dt = \int_0^{2\pi} (\sin^2 t + \cos^2 t)\,dt = 2\pi \neq 0.$$
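Both claims about the form (71) can be checked with a few lines of code. This is a minimal sketch: the sample point, the step size of the difference quotients, and the helper names are assumptions made for illustration.

```python
import math

def A(x, y):
    return -y / (x * x + y * y)

def B(x, y):
    return x / (x * x + y * y)

def partial(g, x, y, wrt, h=1e-5):
    """Central difference quotient of g with respect to 'x' or 'y'."""
    if wrt == "x":
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

# Third condition in (69): A_y - B_x = 0, tested at a sample point off the z-axis.
x0, y0 = 0.7, -1.3
closed_defect = partial(A, x0, y0, "y") - partial(B, x0, y0, "x")

def circle_integral(n=10000):
    """Integral of L over the unit circle x = cos t, y = sin t, 0 <= t <= 2*pi."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx_dt, dy_dt = -math.sin(t), math.cos(t)
        total += (A(x, y) * dx_dt + B(x, y) * dy_dt) * dt
    return total

loop_value = circle_integral()
print(closed_defect, loop_value, 2 * math.pi)
```

The defect in the integrability condition is zero up to rounding, while the integral around the unit circle is 2π rather than 0.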
As a matter of fact, it is easy to calculate $\int L$ around any closed curve
C for the L given by (71). We introduce the polar angle θ of a point
P = (x, y, z) by

$$\cos\theta = \frac{x}{\sqrt{x^2 + y^2}}, \quad \sin\theta = \frac{y}{\sqrt{x^2 + y^2}}, \tag{72}$$

that is, the angle formed with the x,z-plane by the plane through P
passing through the z-axis (see Fig. 1.22). Then

$$d\theta = d \arctan\frac{y}{x} = L, \tag{73}$$

so that L is represented as the total differential of the function u = θ.

Figure 1.22
The complications arise from the fact that formulae (72) define the
values of θ only within whole multiples of 2π. Starting with some
possible value θ_0 for θ at a point P_0, we can define θ at any point
P by joining P to P_0 by a continuous curve and taking

$$\theta(P) = \theta(P_0) + \int_{P_0}^{P} d\theta = \theta_0 + \int_{P_0}^{P} L$$

(see Volume I, p. 434). But θ(P) defined in this way is multiple-valued,
depending on the choice of the curve: for a closed curve C*
the expression

$$\frac{1}{2\pi} \int_{C^*} d\theta$$

represents the number of times C* winds around the z-axis in the
counterclockwise sense, as seen from the positive z-axis (see Fig. 1.23).
Hence, the value of

$$\int_{P_0}^{P} L = \int_{P_0}^{P} d\theta \tag{74}$$

taken for two different paths with end points P_0, P is the same only
if going along one path from P_0 to P and returning along the other
path to P_0 we go zero times around the z-axis. We can prevent any
path from going around the z-axis by considering only points (x, y, z)
with either y ≠ 0 or with y = 0 and x > 0, erecting, in a manner of
speaking, a wall along the half-plane

$$y = 0, \quad x \le 0,$$

which is not to be crossed. The points not excluded form a region R
in which we can assign to θ a unique value with

$$-\pi < \theta < \pi$$

that constitutes a continuously differentiable function θ = θ(x, y, z)
with differential L. The integral (74) extended over any path in
the region that joins P and P_0 has then a unique value θ(P) − θ(P_0),
which does not depend on the particular path. Similarly, the integral
over a closed path in this region has the value 0.

Figure 1.23
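The winding number (1/2π)∫dθ is easy to evaluate for a closed polygon that avoids the z-axis, because the integral of dθ over a straight segment is the signed angle the segment subtends at the axis. The sketch below works with the projection onto the x,y-plane; the helper names and the sample curves are illustrative assumptions.

```python
import math

def winding_number(points):
    """(1/2π) times the integral of dθ over the closed polygon with the given (x, y) vertices."""
    total = 0.0
    m = len(points)
    for k in range(m):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % m]
        # Signed angle subtended at the origin by the segment from (x0, y0) to (x1, y1).
        total += math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
    return round(total / (2 * math.pi))

def circle(center, radius, turns=1, n=400):
    """Vertices of a polygon inscribed in a circle traversed `turns` times."""
    cx, cy = center
    return [
        (cx + radius * math.cos(2 * math.pi * turns * k / n),
         cy + radius * math.sin(2 * math.pi * turns * k / n))
        for k in range(n)
    ]

once_around = winding_number(circle((0.0, 0.0), 1.0))            # encircles the z-axis once
twice_back = winding_number(circle((0.0, 0.0), 1.0, turns=-2))   # twice, in the opposite sense
in_cut_region = winding_number(circle((3.0, 0.0), 1.0))          # stays in x > 0, never crosses the wall
print(once_around, twice_back, in_cut_region)
```

A closed curve that stays in the region with the half-plane y = 0, x ≤ 0 removed, such as the third circle, cannot wind around the z-axis, which is why its integral vanishes.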
d. Simply Connected Sets


In order to formulate the fundamental theorem generally we need
the notion of a simply connected¹ open set. In such a set R, any two
points can be joined by a path lying in R, and any two paths in R with
the same end points can be deformed into each other without moving
the end points and without leaving R.

We give precise definitions of these notions. A path C in R joining
two points P′ = (x′, y′, z′) and P″ = (x″, y″, z″) means three continuous
functions φ(t), ψ(t), χ(t) defined in the interval 0 ≤ t ≤ 1
such that the point P(t) = (φ(t), ψ(t), χ(t)) lies in R for all t of the
interval and coincides with P′ for t = 0 and P″ for t = 1.² The set R
is called connected³ if every two points P′ and P″ of R can be joined
by a path in R. Actually it is easy to see that they can then be joined
also by a smooth simple arc in R, provided the set R is open.⁴

Trivial examples of connected sets are the convex sets R, characterized
by the property that any two of their points P′ and P″ can be
joined by a line segment in R. Here we can choose as linear path with
end points P′ = (x′, y′, z′) and P″ = (x″, y″, z″) simply the triple of
linear functions

$$\varphi(t) = (1 - t)x' + t x'', \quad \psi(t) = (1 - t)y' + t y'', \quad \chi(t) = (1 - t)z' + t z''$$

for 0 ≤ t ≤ 1. Examples of such convex sets are solid spheres or cubes.
Examples of connected, but not convex, sets are a solid torus, a
spherical shell (i.e., the space between two concentric spheres), and
the outside of a sphere or cylinder. Any set R whatsoever in space,
if it is not connected, consists of connected subsets called the components
of R.

¹More precisely “pathwise simply connected.”

²Different t need not correspond to different P(t). Notice that the description of a
path does not only include the set of the points P(t) in space (the “support” of the
path) but also the choice of corresponding parameters t. Every simple arc in space
determines many different paths corresponding to different parameter representations
of the arc. We can always bring about by a linear substitution that the
parameter values vary over the particular interval 0 ≤ t ≤ 1.

³More precisely “pathwise connected.”

⁴Taking a sufficiently fine subdivision of the parameter interval and joining
corresponding points P(t) by line segments, we first obtain a polygonal arc in R joining
P′ and P″. Omitting loops we get a simple polygonal arc. Replacing small portions
near a corner by suitable parabolic arcs, we get a smooth simple arc in R joining
P′ and P″. See also p. 112.

Disconnected are, for example, the set of points not
belonging to a spherical shell or the set of points none of whose
coordinates is an integer.

Let C_0 and C_1 be any two paths in R, given respectively by
(φ_0(t), ψ_0(t), χ_0(t)) and (φ_1(t), ψ_1(t), χ_1(t)). Their end points P′, P″,
corresponding to t = 0 and t = 1, shall be the same. The connected set
R is simply connected if we can “deform C_0 into C_1” or “join C_0 and
C_1” by means of a continuous family of paths C_λ with common end
points P′, P″. This shall mean that there exist continuous functions
φ(t, λ), ψ(t, λ), χ(t, λ) of the two variables t, λ for 0 ≤ t ≤ 1, 0 ≤ λ ≤ 1,
such that the point P = (φ, ψ, χ) always lies in R and such that P
coincides with (φ_0, ψ_0, χ_0) for λ = 0, with (φ_1, ψ_1, χ_1) for λ = 1, with P′
for t = 0 and with P″ for t = 1.¹ For each fixed λ the functions φ, ψ, χ
determine a path C_λ in R that joins the points P′ and P″. As λ varies
from 0 to 1, the path C_λ changes continuously from C_0 to C_1, and in this
sense represents a “continuous deformation” of C_0 into C_1 (see Fig.
1.24).


Figure 1.24 


As is easily seen, convex sets R are simply connected. We only have
to associate with the two curves C_0, C_1 having common end points
P′, P″ the curves C_λ given by

$$\varphi(t, \lambda) = (1 - \lambda)\varphi_0(t) + \lambda\varphi_1(t)$$
$$\psi(t, \lambda) = (1 - \lambda)\psi_0(t) + \lambda\psi_1(t)$$
$$\chi(t, \lambda) = (1 - \lambda)\chi_0(t) + \lambda\chi_1(t).$$

¹The paths C_0 and C_1 are called homotopic relative to P′, P″.

Here C_λ is obtained geometrically by joining points of C_0 and C_1 that
belong to the same t by a line segment and taking the point that
divides the segment in the ratio λ/(1 − λ). The points obtained in this
way all lie in R because of the convexity of R. A different type of
pathwise simply connected set is represented by a spherical shell. Not
simply connected, on the other hand, is the set R obtained by removing
the z-axis from x, y, z-space. Here the two paths (semicircles)

$$x = \cos \pi t, \quad y = \sin \pi t, \quad z = 0; \quad 0 \le t \le 1$$

and

$$x = \cos \pi t, \quad y = -\sin \pi t, \quad z = 0; \quad 0 \le t \le 1$$

have the same end points but cannot be deformed into each other
without crossing the z-axis, which does not belong to R.¹
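The linear deformation can be tried on two semicircles of radius 0.9, which lie in the convex unit ball. The sketch below is only an illustration with assumed names and a finite sample grid: it checks that the family stays inside the ball, and it also shows that this particular family passes through the z-axis, so it is not an admissible deformation in the space with the z-axis removed.

```python
import math

def c0(t):
    # Upper semicircle of radius 0.9 from (0.9, 0, 0) to (-0.9, 0, 0).
    return (0.9 * math.cos(math.pi * t), 0.9 * math.sin(math.pi * t), 0.0)

def c1(t):
    # Lower semicircle with the same end points.
    return (0.9 * math.cos(math.pi * t), -0.9 * math.sin(math.pi * t), 0.0)

def homotopy(t, lam):
    # Point dividing the segment from c0(t) to c1(t) in the ratio lam / (1 - lam).
    p, q = c0(t), c1(t)
    return tuple((1 - lam) * a + lam * b for a, b in zip(p, q))

def in_unit_ball(p):
    return sum(c * c for c in p) < 1.0

def distance_to_z_axis(p):
    return math.hypot(p[0], p[1])

grid = [k / 50 for k in range(51)]
stays_in_ball = all(in_unit_ball(homotopy(t, lam)) for t in grid for lam in grid)
min_axis_distance = min(distance_to_z_axis(homotopy(t, lam)) for t in grid for lam in grid)
print(stays_in_ball, min_axis_distance)
```

That this one family meets the z-axis proves nothing by itself; the statement that no admissible deformation exists rests on the argument with the form (71).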


e. The Fundamental Theorem 


We can now state the relation between the notions of closed and of
exact differential forms:

If the coefficients of the differential form L = A dx + B dy + C dz
have continuous first derivatives in a simply connected set R and satisfy
the integrability conditions

$$B_z - C_y = 0, \quad C_x - A_z = 0, \quad A_y - B_x = 0, \tag{75a}$$

then L is the total differential of a function u defined in R:

$$A = u_x, \quad B = u_y, \quad C = u_z. \tag{75b}$$


For the proof, it is sufficient to show that the integral of L extended
over any simple polygonal arc in R with initial point P′ and final point
P″ has a value that depends only on P′ and P″ (see p. 97). We represent
two such oriented arcs C_0* and C_1* parametrically by, respectively,

$$x = \varphi_0(t), \quad y = \psi_0(t), \quad z = \chi_0(t); \quad 0 \le t \le 1 \tag{76a}$$

and

$$x = \varphi_1(t), \quad y = \psi_1(t), \quad z = \chi_1(t); \quad 0 \le t \le 1 \tag{76b}$$

with t = 0 yielding P′ and t = 1 yielding P″.

¹This follows from the fundamental theorem (Section e) and the fact that there exists a
closed differential form, the one given by (71), whose integral over the whole circle
does not vanish.

Using the simple connectivity of R, we can “embed” the paths (76a, b) into a continuous
family¹

$$x = \varphi(t, \lambda), \quad y = \psi(t, \lambda), \quad z = \chi(t, \lambda) \tag{76c}$$

reducing to (76a, b) for λ = 0, 1 and to P′, P″ for t = 0, 1. We have by
formula (56), p. 89,


$$\int_{C_1^*} L - \int_{C_0^*} L = \int_0^1 \left[ (A x_t + B y_t + C z_t)\big|_{\lambda=1} - (A x_t + B y_t + C z_t)\big|_{\lambda=0} \right] dt, \tag{76d}$$

where x, y, z are the functions of t, λ given by (76c). We assume, to
begin with, that those functions have continuous first derivatives with
respect to t, λ and a continuous mixed second derivative for 0 ≤ t ≤ 1,
0 ≤ λ ≤ 1. Then by (76d)

$$\int_{C_1^*} L - \int_{C_0^*} L = \int_0^1 dt \int_0^1 (A x_t + B y_t + C z_t)_\lambda \, d\lambda. \tag{76e}$$


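Now using the chain rule of differentiation and the integrability
conditions (75a), we have the identity

$$\begin{aligned}
(A x_t + B y_t + C z_t)_\lambda &= A x_{t\lambda} + B y_{t\lambda} + C z_{t\lambda} + A_x x_\lambda x_t + A_y y_\lambda x_t + A_z z_\lambda x_t \\
&\quad + B_x x_\lambda y_t + B_y y_\lambda y_t + B_z z_\lambda y_t + C_x x_\lambda z_t \\
&\quad + C_y y_\lambda z_t + C_z z_\lambda z_t \\
&= (A x_\lambda + B y_\lambda + C z_\lambda)_t.
\end{aligned}$$

Interchanging orders of integration (see p. 80), we find that

$$\int_{C_1^*} L - \int_{C_0^*} L = \int_0^1 d\lambda \int_0^1 (A x_\lambda + B y_\lambda + C z_\lambda)_t \, dt = 0,$$

since x_λ, y_λ, z_λ vanish for t = 0, 1, because the end points are independent
of λ.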

One sees the important part played in the proof by the assumption
that R is simply connected. It enables us to convert the difference of
the line integrals into a double integral over some intermediate
region.

¹The paths of the family need not be simple for λ ≠ 0, 1.

It is easy to remove the restrictions on the existence of derivatives
of the functions φ, ψ, χ. Assume only that the arcs C_0* and C_1* are
smooth, that is, that the functions φ(t, λ), ψ(t, λ), χ(t, λ) have a continuous
t-derivative when λ has one of the values 0 or 1 while being continuous
for other values of λ. We can then (see p. 82) approximate these
functions uniformly by functions $\tilde{\varphi}, \tilde{\psi}, \tilde{\chi}$, which have continuous
first derivatives with respect to t and λ and a continuous mixed second
derivative. In order that the smoother functions obtained represent a
deformation of the paths C_0* and C_1* into each other, they have to
agree with φ, ψ, χ for λ = 0, 1 and for t = 0, 1. This can always be
brought about by a slight modification of $\tilde{\varphi}, \tilde{\psi}, \tilde{\chi}$, by adding suitable
terms so that

$$\begin{aligned}
x &= \tilde{\varphi}(t, \lambda) - (1 - \lambda)\,[\tilde{\varphi}(t, 0) - \varphi_0(t)] - \lambda\,[\tilde{\varphi}(t, 1) - \varphi_1(t)] \\
&\quad - (1 - t)\,[\tilde{\varphi}(0, \lambda) - \varphi_0(0)] - t\,[\tilde{\varphi}(1, \lambda) - \varphi_0(1)] \\
&\quad + (1 - t)(1 - \lambda)\,[\tilde{\varphi}(0, 0) - \varphi_0(0)] + (1 - t)\lambda\,[\tilde{\varphi}(0, 1) - \varphi_0(0)] \\
&\quad + t(1 - \lambda)\,[\tilde{\varphi}(1, 0) - \varphi_0(1)] + t\lambda\,[\tilde{\varphi}(1, 1) - \varphi_0(1)]
\end{aligned}$$

with analogous expressions for y and z. These functions have the
correct values for λ = 0, 1, and for t = 0, 1, have continuous first
derivatives and mixed second derivatives, and can be made to
approximate the original φ, ψ, χ so closely that the corresponding
points (x, y, z) still lie in the open set R.

Finally, the equality of the integrals of L can be extended to arcs
C_0*, C_1* that are only sectionally smooth, e.g., to polygonal arcs,
by approximating these arcs by smooth ones with the same end
points. The integrals over the approximating smooth arcs all have
the same values, and the same follows then in the limit for the
integrals over C_0* and C_1*.


Appendix 


Geometrical intuition and physical reality always have provided 
powerful motivation and guiding ideas for constructive mathematical 
thought. Nevertheless, with the advance of analysis since the begin- 
ning of the nineteenth century, it has become a compelling necessity 
to cease invoking intuition as the prime justification of mathematical 
considerations. More and more, one has turned to rigorous proofs 
based on axiomatically hardened precision and clearly formulated 
concepts and procedures. In this development the notion of set, in 
particular of point set, has played a major role and by now has been 
absorbed into the fabric of analysis. Of some of these developments 
this appendix gives a simple introductory account. 
A.1. The Principle of the Point of Accumulation in Several
Dimensions and Its Applications


To establish the theory of functions of several variables on a firm 
basis, we can proceed in exactly the same way as in the case of 
functions of one variable. It is sufficient to discuss these matters in the 
case of two variables only, since the methods are essentially the same 
for functions of more than two independent variables. 


a. The Principle of the Point of Accumulation 


We base our discussion on Bolzano’s and Weierstrass’s principle of
the point of accumulation. A pair of numbers (x, y) may be represented
in the usual way by means of a point with the rectangular coordinates
x and y in an x,y-plane. We now consider a bounded infinite set of
such points P(x, y), that is, a set containing an infinite number of distinct
points, all of them lying in a bounded part of the plane, so that
|x| < C and |y| < C, where C is a constant. The principle of the point
of accumulation states that every bounded infinite set S of points has
at least one point of accumulation. That is, there exists a point Q with
coordinates (ξ, η) such that an infinite number of points of S lie in
every neighborhood of Q, say, in every region

$$(x - \xi)^2 + (y - \eta)^2 < \delta^2,$$

where δ is any positive number. It follows that, out of the infinite
bounded set of points, we can choose a sequence of distinct points
P_1, P_2, P_3, ... that converges to the limit Q. The sequence of the P_i
can be constructed by induction, giving δ successively the values 1,
1/2, 1/3, ...: we choose P_1 arbitrarily in S; if P_1, ..., P_n have been
defined, we take for P_{n+1} any one of the infinitely many points in the
set S that have distance < 1/(n + 1) from Q and are different from
Q and from P_1, ..., P_n.

This principle of the point of accumulation for several dimensions
can be proved analytically by the method used in the corresponding
proof in Volume I (p. 95), merely by substituting rectangular regions
for the intervals used there. An easier proof is obtained if we make use
of the principle for one dimension. To do this we notice that by
hypothesis every point P(x, y) of the set S has an abscissa x for which
the inequality |x| < C holds. Either there is an x = x_0 that is the
abscissa of an infinite number of points P (which therefore lie vertically
above one another) or else each x belongs only to a finite number
of points P. In the first case, we fix upon x_0 and consider the infinite
number of values of y such that (x_0, y) belongs to our set. These values
of y have a point of accumulation η_0 by the principle for one dimension.
Hence, we can find a sequence of values of y, say y_1, y_2, ..., such that
y_n → η_0, from which it follows that the points (x_0, y_n) of the set tend
to the limit point (x_0, η_0), which is thus a point of accumulation of the
set. In the second case, there must be an infinite number of distinct
values of x that are the abscissae of points of the set, and we can choose
a sequence x_1, x_2, ... of these abscissae tending to a limit ξ. For each x_n,
let P_n = (x_n, y_n) be a point of the set with abscissa x_n. The y_n form
a bounded sequence of numbers; hence, we can choose a subsequence
y_{n_1}, y_{n_2}, ... tending to a limit η. The corresponding subsequence
of abscissae x_{n_1}, x_{n_2}, ... still tends to the limit ξ; hence, the
points P_{n_1}, P_{n_2}, ... tend to the limit point (ξ, η). Thus, in either case,
we can find a sequence of points of the set tending to a limit point, and
the theorem is proved.


b. Cauchy’s Convergence Test. Compactness 


A consequence of the Bolzano-Weierstrass theorem is that every
bounded infinite sequence of points P_1, P_2, ... has a convergent
subsequence. Indeed, if the sequence contains an infinite number of
distinct elements, they form an infinite set of distinct points from
which, according to the Weierstrass principle, we can choose a
sequence converging to a point Q. If the sequence does not contain
an infinite number of distinct elements, then at least one of its elements
must be repeated infinitely often; there exists then a point Q
that appears infinitely often in the sequence, and the subsequence
formed by elements that equal Q converges to the point Q.

An important consequence is Cauchy’s convergence test:


A sequence of points P_1, P_2, ... in the plane (and similarly a sequence
in n-dimensional euclidean space) converges to a limit if and
only if for every ε > 0 there exists a number N = N(ε) such that the
distance between P_n and P_m is less than ε whenever both n and m
are greater than N.


The proof proceeds exactly like the corresponding one for sequences
of real numbers given in Volume I (p. 97). One sees immediately
that a sequence satisfying the Cauchy condition is bounded;
hence, by the preceding theorem, it contains a convergent subsequence
with a limit Q, and it then follows immediately that the
whole sequence converges to Q.
A set S of points in the plane was called closed if all boundary
points of S belong to S. The limit Q of every convergent sequence of
points of a closed set S is again a point of S (see p. 9). Since every
bounded infinite sequence has been seen to contain a convergent
subsequence of points, we find that every infinite sequence formed from
points of a bounded and closed set S of points in the plane contains a
subsequence that converges to a point of S. Generally we call a set S
compact¹ if every sequence formed from elements of S contains a
convergent subsequence with a limit in S. Hence, a closed and bounded
set of points in the plane (or in n-dimensional euclidean space) is
compact. The reader can easily verify the converse: Every compact
set of points in the plane is closed and bounded. In the future we shall
often refer to closed and bounded sets simply as compact sets.


c. The Heine-Borel Covering Theorem 


A striking consequence of the Bolzano-Weierstrass principle is the
Heine-Borel theorem:


Let there be given a compact (i.e., closed and bounded) set S and a
system Σ of infinitely many open sets that cover S in the sense that
every point of S belongs to at least one of the open sets in Σ. Then we
can find a finite number of sets in Σ that already cover S.

As an illustration consider the infinite set S of points on the x-axis
consisting of the points P_n = (1/n, 0) for n = 1, 2, ... and of the origin
P_0 = (0, 0). This is a closed set. For n = 1, 2, ..., let S_n denote the
open disk

$$\left(x - \frac{1}{n}\right)^2 + y^2 < \left(\frac{1}{3n^2}\right)^2$$

with center P_n and radius 1/(3n²), and let S_0 denote the disk

$$\sqrt{x^2 + y^2} < \frac{1}{100}.$$

Clearly the infinite system of all sets S_0, S_1, S_2, ... covers S. In agreement
with the Heine-Borel theorem we can pick a finite subsystem that
covers S, for example the system consisting of S_0, S_1, ..., S_100. Here
we immediately see the importance of the assumption that S be closed.
The set T of points consisting of P_1, P_2, ... alone, without P_0, is
covered by the system consisting of S_1, S_2, ..., but no finite subsystem
of these sets, each of which contains only a single point of T,
can cover T.

¹Sometimes more precisely “sequentially compact.”
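The illustration can be verified for finitely many of the points with exact rational arithmetic. The sketch below assumes the radius 1/(3n²) for S_n as read from the text; the helper names and the cut-off N are illustrative, so the check covers only the first N points of T together with the origin.

```python
from fractions import Fraction

def in_disk_n(n, x):
    """Is the point (x, 0) in S_n, the open disk about (1/n, 0) of radius 1/(3 n^2)?"""
    return abs(x - Fraction(1, n)) < Fraction(1, 3 * n * n)

def in_disk_0(x):
    """Is the point (x, 0) in S_0, the open disk about the origin of radius 1/100?"""
    return abs(x) < Fraction(1, 100)

def covered_by_finite_subsystem(x, last=100):
    # Membership in the union of S_0, S_1, ..., S_last.
    return in_disk_0(x) or any(in_disk_n(n, x) for n in range(1, last + 1))

N = 500  # number of points P_n = (1/n, 0) sampled from T
points_T = [Fraction(1, n) for n in range(1, N + 1)]

# S_0, S_1, ..., S_100 cover the origin and every sampled point of T.
all_covered = covered_by_finite_subsystem(Fraction(0)) and all(
    covered_by_finite_subsystem(x) for x in points_T
)

# Each S_n (checked for n up to 100) contains exactly one of the sampled points of T.
single_point_each = all(
    sum(1 for x in points_T if in_disk_n(n, x)) == 1 for n in range(1, 101)
)
print(all_covered, single_point_each)
```

Since each S_n holds only the one point P_n of T, any finite subsystem of S_1, S_2, ... covers only finitely many points of T, which is the reason no finite subsystem covers T.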

To prove the Heine-Borel theorem, we use an indirect argument.
Suppose that the theorem is false. The set S, being bounded, lies in a
square Q. This square we subdivide into four equal squares. For at
least one of these four squares, the part of S lying in the square or on
its boundary cannot be covered by a finite number of the sets in Σ; for if each of
the four parts of S could be covered in this way, S itself would be
covered. This square we call Q_1. We now subdivide Q_1 into four
equal parts. By the same argument one of the four parts of Q_1 is a
square Q_2 such that the points of S lying in Q_2 or on its boundary
cannot be covered by a finite number of the open sets in Σ. Continuing
in this way, we obtain an infinite sequence of squares Q_1, Q_2,
Q_3, ..., each contained in the preceding one, their size shrinking to
0, and such that the points of S in the closure of any Q_n cannot be
covered by a finite number of the sets in Σ. Clearly, for each n we can
find a point P_n of S that lies in the interior or on the boundary of Q_n.
Then P_1, P_2, ... is a sequence of points of S. Since S is bounded, the
sequence is bounded and must have a subsequence converging to some
point A. Since S is closed, A is a point of S and hence contained in an
open set Ω belonging to Σ. But then a whole neighborhood of A
belongs to that open set Ω, say, the neighborhood consisting of the
points having distance less than ε from A. We can choose an n in the
subsequence so large that P_n has distance less than ε/2 from A and that
the diagonal of Q_n has length less than ε/2. Then the whole square Q_n
is contained in the ε-neighborhood of A and hence also in Ω. We see
that the single set Ω of the system Σ contains a whole square Q_n and
its boundary, contrary to the assumption for the sequence Q_n. This
completes the proof.


d. An Application of the Heine-Borel Theorem to Closed Sets 
Contained in Open Sets 


Let R be an open set in the plane.¹ By definition every point P of R
has a neighborhood that lies completely in R. For points P close to
the boundary of R the neighborhood has to be very small. It is remarkable
that for P confined to a closed and bounded subset S of R we can find a
uniform size for the neighborhoods that are contained in R:


If a closed and bounded set S is contained in an open set R, there
exists a positive ε such that the ε-neighborhood of every point P of S
is contained in R. In other words, the points not in R lie at least a
distance ε away from all points of S.

¹Everything said in this paragraph applies equally well to higher dimensions if we
substitute the term “ball” for “disk.”

For the proof we make use of the assumption that R is open. For
every point P in R there exists a disk with center P that is contained
in R. The radius of this disk, call it r, depends on P; that is, r = r(P).
We take now for any P in S the open disk of radius ½r(P) and center
P. By the Heine-Borel theorem a finite number of these disks can be
found that cover the compact set S. Thus, we can find a finite number
of points P_1, ..., P_n in S such that every point P of S is contained in
one of the disks of center P_k and radius ½r(P_k) for k = 1, ..., n. Let ε
be the smallest of the positive numbers ½r(P_1), ..., ½r(P_n). Then, for
every P in S, the ε-neighborhood of P lies in R, for P lies in some disk
of center P_k and radius ½r(P_k). By construction the concentric disk
D of radius r(P_k) lies completely in R. Since the distance of P from P_k
is less than ½r(P_k) and ε ≤ ½r(P_k), the disk D contains the disk of
radius ε about P. This shows that the disk of radius ε and center P lies in R.

As an example, we consider a curve S lying in the open set R. Such
a curve is a set of points P = (x, y) that can be represented in the form

$$x = \varphi(t), \quad y = \psi(t)$$

with the help of two continuous functions φ and ψ, where the parameter
t varies over a closed interval a ≤ t ≤ b.² Such a curve S is a
closed point set, for let P_1, P_2, ... be a sequence of points on S converging
to a point P. We consider the corresponding parameter values
t_1, t_2, ..., which all lie in the closed interval a ≤ t ≤ b. Since a
closed bounded interval is compact, a subsequence of the t_n converges
to a value t in the interval. Since φ and ψ are continuous, the corresponding
P_n converge to the point Q = (φ(t), ψ(t)) on S. Thus, a subsequence
of the sequence P_1, P_2, ... converges to a point Q of S.
Since the whole sequence converges to P, we have P = Q, and hence
P lies in S. Thus, S contains all limits of sequences of points of S and
hence is closed.

If the curve lies in the open set R, we can find a positive number ε
such that all disks of radius ε with centers on S lie in R. Since φ and ψ
are continuous, and hence uniformly continuous, we can find a
positive number δ such that two points on S have distance less than
ε if their parameter values t differ by less than δ. We can divide the


¹ It is essential that S is bounded. If, for example, R is the open half-plane y > 0 and
S the closed set consisting of the points in the x,y-plane with y ≥ 1/x, x > 0, the
boundary of R comes arbitrarily close to points of S.

² The curve need not be simple; that is, different t may correspond to the same point
P. The pair of functions defines a “path,” and S is the support of that path.

parameter interval by points $t_1, \ldots, t_{n-1}$ such that

$$a = t_0 < t_1 < t_2 < \cdots < t_{n-1} < t_n = b,$$

where the length of every subinterval is less than δ. Let $P_0, P_1, \ldots, P_n$
be the corresponding points on S. Then $P_{i+1}$ always lies in the disk
of radius ε about $P_i$. Also, the straight line segment joining $P_i$ and
$P_{i+1}$ lies completely in the disk of radius ε and center $P_i$, and hence
is contained in R. If we join successive points $P_i$ by straight line
segments, we obtain a polygonal curve that lies completely in R and
has the same end points $P_0$, $P_n$ as the continuous curve S. We can
formulate this result as follows:


If two points of an open set R can be joined by a curve that lies in R, 
then they can also be joined by a polygonal curve in R. 
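
The construction in this proof can be traced numerically. The following Python sketch is only an illustration under stated assumptions: the open set R (an annulus), the path (a unit semicircle), and the values of ε and δ are hypothetical choices made for this example, and sampling points along each segment is a numerical check, not a proof.

```python
import math

def curve(t):
    """Assumed example path: the upper unit semicircle, 0 <= t <= pi."""
    return (math.cos(t), math.sin(t))

def in_R(p):
    """Assumed open set R: the annulus 0.5 < |p| < 1.5 (not convex)."""
    return 0.5 < math.hypot(p[0], p[1]) < 1.5

a, b = 0.0, math.pi
eps = 0.5      # every disk of radius eps centered on the curve lies in R
delta = 0.5    # |curve(t) - curve(s)| <= |t - s|, so delta = eps suffices

# Divide [a, b] into n subintervals, each shorter than delta.
n = math.ceil((b - a) / delta) + 1
ts = [a + (b - a) * i / n for i in range(n + 1)]
vertices = [curve(t) for t in ts]

def segment_points(p, q, samples=50):
    """Sample points on the straight segment from p to q."""
    return [(p[0] + (q[0] - p[0]) * s / samples,
             p[1] + (q[1] - p[1]) * s / samples) for s in range(samples + 1)]

# Check that every sampled point of the polygonal curve lies in R.
inside = all(in_R(x)
             for p, q in zip(vertices, vertices[1:])
             for x in segment_points(p, q))
print(n, inside)
```

The polygon shares its end points with the curve because the first and last parameter values are a and b themselves.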


A.2. Basic Properties of Continuous Functions 


For functions f defined and continuous in a closed and bounded set 
S we can state the following two fundamental theorems: 


The function f assumes a greatest value (“maximum”) and a least 
value (“minimum”) in S. 


The function f is uniformly continuous in S. 

The proofs of these theorems are like the corresponding proofs for 
functions of one variable (see Volume I, pp. 100-101) and need not be 
repeated. 

The second theorem can also be obtained as an immediate consequence
of the Heine-Borel theorem. Prescribe an ε > 0. If f is continuous
at every point of S, there exists for every point P in S a
δ-neighborhood of P of a certain radius δ = δ(P) such that
$|f(Q) - f(P)| < \varepsilon/2$ for any Q in S that lies in that neighborhood. Now for each
P in S choose a neighborhood $\Omega_P$ of radius $\tfrac12\delta(P)$. The $\Omega_P$ clearly
cover S. We can select a finite number of them, say those with centers
$P_1, \ldots, P_n$, that also cover S. Let Δ be the smallest of the numbers
$\tfrac12\delta(P_1), \ldots, \tfrac12\delta(P_n)$. If then P and Q are any two points of S whose
distance is less than Δ, the point P has distance less than $\tfrac12\delta(P_k)$
from one of the points $P_k$ with $k = 1, \ldots, n$. Since $\Delta \le \tfrac12\delta(P_k)$, we
see that both P and Q lie in the $\delta(P_k)$-neighborhood of $P_k$. Hence,

$$|f(P) - f(P_k)| < \tfrac12\varepsilon, \qquad |f(Q) - f(P_k)| < \tfrac12\varepsilon,$$

and thus

$$|f(P) - f(Q)| < \varepsilon.$$

This establishes the uniform continuity of f since Δ is independent
of the particular location of P and Q.


A.3. Basic Notions of the Theory of Point Sets 


a. Sets and Subsets 


In more complicated arguments involving sets of points (particularly
in the theory of integration) it is convenient to use some standard
notations for operations with sets. The sets of interest to us are
always sets of numbers, of points, of functions, or of sets of these
types. For example, a “disk” in the plane is defined as a set of points
(x, y) for which

$$(x - x_0)^2 + (y - y_0)^2 < r^2$$

for fixed $x_0$, $y_0$, r. An example of a set of sets (or family of sets) would
be that consisting of all disks that contain the origin; that would be
those disks for which $x_0^2 + y_0^2 < r^2$.

We shall refrain from trying to reduce the basic notion of set to
still more fundamental ones or to analyze the logical difficulties
involved in this notion. For us a set S is defined if for every object a
exactly one of the two following statements is correct: (1) a belongs to
S; (2) a does not belong to S. In case (1) one also says that a is an
element of S or that a is contained in S; symbolically¹ one denotes this by

$$a \in S,$$

and case (2) by

$$a \notin S.$$

For example, if S is the disk given by the inequality $x^2 + y^2 < r^2$,
then $a \in S$ means that a is a point in the plane with coordinates x, y
that has the property that $x^2 + y^2 < r^2$. Generally the elements of a
set S can be characterized by some common properties (e.g., by the
property of belonging to S). We write the set S of elements a that have
the properties A, B, . . . symbolically as

$$S = \{a : a \text{ has the properties } A, B, \ldots\}.$$


¹ The symbol ∈ must not be confused with the Greek letter ε.

For example, the disk S with center $(x_0, y_0)$ and radius r can be
described as

$$S = \{(x, y) : x, y = \text{real numbers};\ (x - x_0)^2 + (y - y_0)^2 < r^2\}.$$

The set described by

$$S = \{n : n = \text{integer};\ 2 < n < 5\}$$

consists of the two elements n = 3 and n = 4.
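
The set-builder notation above corresponds closely to set comprehensions and membership tests in a programming language. The following Python sketch is only an illustration: since a program cannot enumerate all integers, the search is restricted to a finite window (the bounds −10 and 10 are an arbitrary assumption), and the helper name `in_disk` is hypothetical.

```python
# S = {n : n integer; 2 < n < 5}, restricted to a finite search window
# because a program cannot enumerate all integers.
S = {n for n in range(-10, 11) if 2 < n < 5}

def in_disk(x, y, x0=0.0, y0=0.0, r=1.0):
    """Membership test for the disk (x - x0)^2 + (y - y0)^2 < r^2."""
    return (x - x0) ** 2 + (y - y0) ** 2 < r ** 2

print(sorted(S))            # [3, 4]
print(in_disk(0.5, 0.5))    # True: the point belongs to the unit disk
print(in_disk(1.0, 0.0))    # False: the strict inequality excludes the rim
```

An infinite set such as the disk cannot be listed element by element, so it is represented here by its defining property, exactly as in the symbolic description.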

For many purposes it is convenient to introduce the “empty” (or
“null”) set with the special symbol ∅. This set has no elements:
$a \notin \emptyset$ for all a. For example, an open disk of radius 0 and center at the
origin coincides with ∅:

$$\{(x, y) : x, y = \text{real numbers};\ x^2 + y^2 < 0\} = \emptyset.$$

Two sets S and T are equal when they have the same elements,
regardless of the different descriptions or properties used in their
definition: S = T means that $x \in S$ if and only if $x \in T$.

A set S is said to be a subset of a set T (“S is contained in T”) if T
contains all the elements that are contained in S, that is, if $a \in S$
implies $a \in T$. We write this symbolically:

$$S \subset T$$

or, more rarely,

$$T \supset S.$$

Thus, if S is the disk of radius 1 about the origin and T the disk of
radius 4 about the point (1, 1), then $S \subset T$. Similarly, $\emptyset \subset S$ and $S \subset S$
for all sets S.

The symbols ⊂ and ⊃ are chosen, of course, for their similarity to
the < and > signs of arithmetic (or more precisely to the ≤ and ≥
signs). They share with the latter symbols the basic properties:

$S \subset T$ and $T \subset S$ implies $S = T$,
$S \subset T$ and $T \subset R$ implies $S \subset R$.¹

¹ This is the common syllogism from logic: If all objects with the property A have the
property B and all objects with the property B have the property C, then all objects
with the property A have the property C.

A basic difference between the “contained in” signs for sets and the
order signs for numbers is that for real numbers we always have either
$x \le y$ or $y \le x$, whereas for sets neither of the propositions $S \subset T$ or
$T \subset S$ has to hold. The symbol ⊂ defines only a “partial” ordering
between sets; of two sets neither may contain the other one.


b. Union and Intersection of Sets 


During the last decades a great number of logical symbols have 
found wide acceptance in mathematics, so that it is now customary to 
express many mathematical theorems completely in symbolic notations
without the use of ordinary words or sentence structure.¹ Use of
proper symbolic notation has been essential for the development of 
mathematics from the very beginning; in fact, in rare instances, pro- 
gress in some field may have slowed down for centuries just for lack 
of a suitable notation, as was perhaps the case with algebra in an- 
tiquity. On the other hand, too concentrated a notation may prove a 
great strain to the reader who tries to relate the information in the 
“dehydrated” form to his ordinary experience. Authors of books not 
primarily devoted to logic and foundations of mathematics compro- 
mise on the use of logical abbreviations in accordance with their 
tastes and the requirements of the special subjects under considera- 
tion. 

There are two further set-theoretical symbols that we shall find
almost indispensable later in this book, namely, the symbols for the
operations of “union” and “intersection” of sets. Given two sets S and
T we write $S \cup T$ for the “union” of the two sets, that is, for the set of
elements that are “either” in S “or” in T:²

$$S \cup T = \{a : a \in S \text{ or } a \in T\}.$$

Similarly, the “intersection” $S \cap T$ of S and T is defined as the set of
elements that belong to both S and T:

$$S \cap T = \{a : a \in S \text{ and } a \in T\}.$$


¹ Examples of frequently used symbols follow:

$\{x_1, x_2, \ldots, x_n\}$: the set whose members are precisely $x_1, \ldots, x_n$

$S \times T$: the set of ordered pairs (a, b) with $a \in S$ and $b \in T$ (“Cartesian product”
of the sets S, T)

$\Rightarrow$: “implies”

$\exists x$: “there exists an x”

$\forall x$: “for all x.”

² Here the word “or” like the Latin vel is not exclusive. $S \cup T$ consists of the elements
that belong to at least one of the two sets S, T but may belong to both.

For example, if S and T are intervals on the real number axis and if

$$S = \{x : 3 < x < 5\},$$
$$T = \{x : 4 < x < 6\},$$

then

$$S \cup T = \{x : 3 < x < 6\},$$
$$S \cap T = \{x : 4 < x < 5\}.$$


The operations $\cup$ and $\cap$ apply to any two sets S and T, provided we
use the symbol for the empty set, writing

$$S \cap T = \emptyset$$

when S and T are disjoint, that is, have no common element. Notice
that $S \cup \emptyset = S$, $S \cap \emptyset = \emptyset$ for any S.

The operation $\cup$ has many properties in common with addition. In
particular, if S and T are “disjoint” sets (that is, sets without common
elements) and have finitely many elements, then the number of
elements in $S \cup T$ is just the sum of the numbers of elements in S and
in T. There is, however, generally no unique inverse operation to
union. Only if S and T are assumed to be disjoint and $S \subset R$, does the
equation

$$S \cup T = R$$

have a unique solution T. For disjoint sets S, T the union is often
denoted by S + T, and for $S \subset R$, the solution T of the equation S + T
= R by R − S (“the complement of S relative to R”). We shall use
the symbol R − S more generally for any sets R, S to denote the set of
elements of R that do not belong to S. Then $S + (R - S) = R \cup S$.
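
These rules can be tried out on small finite sets. The Python sketch below is only an illustration; the particular sets are arbitrary sample values, and Python's built-in operators `|`, `&`, and `-` stand in for union, intersection, and relative complement.

```python
# Finite sets standing in for S, T, R (illustrative values only).
S = {1, 2, 3}
T = {4, 5}                 # disjoint from S
R = {1, 2, 3, 4, 5, 6}     # contains S

union = S | T              # S ∪ T
common = S & T             # S ∩ T, empty because S and T are disjoint
assert common == set()

# For disjoint finite sets the element counts add.
assert len(union) == len(S) + len(T)

complement = R - S         # R − S: the elements of R not in S
# Since S is a subset of R, the disjoint union S + (R − S) recovers R.
assert S & complement == set()
assert S | complement == R

# For arbitrary sets R2 and S the text's identity S + (R2 − S) = R2 ∪ S holds.
R2 = {2, 3, 7}
assert S | (R2 - S) == R2 | S
print(sorted(union), sorted(complement))   # [1, 2, 3, 4, 5] [4, 5, 6]
```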

The union of n sets $S_1, \ldots, S_n$ is defined as the set of elements
belonging to at least one of the sets $S_1, \ldots, S_n$ and is variously
denoted by

$$\{a : a \in S_1 \text{ or } a \in S_2 \text{ or } \ldots \text{ or } a \in S_n\} = S_1 \cup S_2 \cup \cdots \cup S_n = \bigcup_{k=1}^{n} S_k$$

in analogy to the summation and product symbols. Similarly, the
intersection of the sets $S_1, \ldots, S_n$, defined as the set of elements
common to all of them, is

$$\{a : a \in S_1 \text{ and } a \in S_2 \text{ and } \ldots \text{ and } a \in S_n\} = S_1 \cap S_2 \cap \cdots \cap S_n = \bigcap_{k=1}^{n} S_k.$$

We can with equal ease form unions and intersections of an infinite
number of sets $S_1, S_2, \ldots, S_n, \ldots$, which we write respectively as

$$\bigcup_{k=1}^{\infty} S_k = \{a : a \in S_n \text{ for some } n\},$$

$$\bigcap_{k=1}^{\infty} S_k = \{a : a \in S_n \text{ for all } n\}.$$

For example, if $S_n$ is the set of real numbers x < n,

$$S_n = \{x : x \text{ real},\ x < n\},$$

we have

$$\bigcup_{k=1}^{\infty} S_k = \{x : x \text{ real}\},$$

$$\bigcap_{k=1}^{\infty} S_k = \{x : x \text{ real},\ x < 1\}.$$


In fact, union and intersection can be formed for arbitrarily large
families F of sets S even where the different sets S in F are not, or
cannot be, distinguished by a subscript n with n = 1, 2, 3, . . . .
We write

$$\bigcup_{S \in F} S = \{a : a \in S \text{ for some } S \text{ with } S \in F\},$$

$$\bigcap_{S \in F} S = \{a : a \in S \text{ for all } S \text{ with } S \in F\}.$$


Thus the union of all disks in the x,y-plane containing the point (1, 0)
but not the point (−1, 0) is the set of all (x, y) for which either y ≠ 0
or y = 0 and x > −1. The intersection of the same family of disks
contains the single point (1, 0).
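
Union and intersection over a whole family of sets can be sketched for a finite family as follows. This is an illustration only: the family below (sets of integers $\{0, 1, \ldots, n-1\}$ for n = 1, . . . , 5) and the helper names `union_all` and `intersection_all` are assumptions made for the example, and infinite families cannot be handled by such enumeration.

```python
from functools import reduce

# A finite family F of finite sets: S_n = {x integer : 0 <= x < n}, n = 1..5.
family = [set(range(n)) for n in range(1, 6)]

def union_all(sets):
    """Elements belonging to at least one set of the family."""
    return reduce(set.union, sets, set())

def intersection_all(sets):
    """Elements common to every set of a nonempty family."""
    return reduce(set.intersection, sets)

print(sorted(union_all(family)))          # [0, 1, 2, 3, 4]
print(sorted(intersection_all(family)))   # [0]
```

As in the example with $S_n = \{x : x < n\}$, the union is governed by the largest sets of the family and the intersection by the smallest.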


c. Applications to Sets of Points in the Plane 


Some of our earlier results and definitions (see pp. 6-8) can be
rewritten more compactly in the notation introduced in the last
sections. Thus, given a set S of points in the plane, we obtain a
decomposition of the whole plane π into three disjoint sets, namely, the set $S^0$
of interior points of S, the set $\partial S$ of boundary points of S, and the set
$S_e$ of exterior points of S:

$$\pi = S^0 \cup \partial S \cup S_e$$

or more precisely,

$$\pi = S^0 + \partial S + S_e,$$

since the sets are disjoint:

$$S^0 \cap \partial S = \partial S \cap S_e = S_e \cap S^0 = \emptyset.$$

Here

$$S^0 \subset S \subset S^0 + \partial S.$$

The set $\bar S$ defined by

$$\bar S = S^0 + \partial S = S \cup \partial S \tag{1}$$

is the closure of S. We have $S^0 = S$ for open S and $\bar S = S$ for closed S.
The reader may verify as exercises the following propositions:

$\overline{\partial S} = \partial S$ (“The boundary of a set is always closed.”)
$\bar{\bar S} = \bar S$ (“The closure of a set is always closed.”)
$(S^0)^0 = S^0$, $(S_e)^0 = S_e$ (“The sets $S^0$ and $S_e$ are open.”)

(a) $S^0 \cup T^0 \subset (S \cup T)^0$, $\overline{S \cup T} \subset \bar S \cup \bar T$
(b) $\partial(S \cup T) \subset \partial S \cup \partial T$

The union of open sets is open. 

The union of a finite number of closed sets is closed. 
The intersection of a finite number of open sets is open. 
The intersection of closed sets is closed. 


The last statements indicate a kind of symmetry (“duality”)
between the notions “open” and “closed,” “union” and “intersection.”
This becomes more precise if we introduce the complement C(S)
of a set S, that is, the set of points in the plane π not belonging to S:¹

$$C(S) = \{P : P \in \pi,\ P \notin S\} = \pi - S.$$


¹ For sets S of points in three-space $\Sigma_3$ the complement of S is defined as $\Sigma_3 - S$, the
set of points of $\Sigma_3$ not belonging to S.

We have

$$C(S)^0 = S_e, \qquad \partial C(S) = \partial S, \qquad C(S)_e = S^0.$$


If S is open, C(S) is closed, and vice versa. The complement of the 
intersection of several sets is the union of their complements. 
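
The rule for complements can be checked on finite sets. In the Python sketch below a small finite “universe” stands in for the plane π; this substitution, the sample sets, and the helper name `complement` are assumptions made only so that complements can be enumerated. The dual rule for unions, which is not stated above but follows in the same way, is checked as well.

```python
# A finite "universe" standing in for the plane (an assumption made so
# that complements can be enumerated).
universe = set(range(10))

def complement(s):
    """C(S): the points of the universe not belonging to S."""
    return universe - s

S = {0, 1, 2, 3}
T = {2, 3, 4, 5}

# Complement of the intersection equals the union of the complements.
assert complement(S & T) == complement(S) | complement(T)
# Dually, complement of the union equals the intersection of the complements.
assert complement(S | T) == complement(S) & complement(T)
# Taking the complement twice returns the original set.
assert complement(complement(S)) == S
```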

In this notation the theorem of Heine-Borel takes a particularly
simple form. “A family F of sets covers a set S” means simply that S
is contained in the union of the sets of F. The theorem then simply
states:


If F is a family of open sets in the plane and if S is a bounded and
closed set such that

$$S \subset \bigcup_{T \in F} T,$$

then we can find a finite number of sets $T_1, T_2, \ldots, T_n \in F$ such that

$$S \subset \bigcup_{k=1}^{n} T_k.$$

A.4. Homogeneous Functions 


The simplest homogeneous functions occurring in analysis and its
applications are the forms or homogeneous polynomials in several
variables (see p. 13). We say that a function of the form ax + by is a
homogeneous function of the first degree in x and y, that a function of
the form $ax^2 + bxy + cy^2$ is a homogeneous function of the second
degree, and in general that a polynomial in x and y (or in a greater
number of variables) is a homogeneous function of degree h if in each
term the sum of the exponents of the independent variables is equal to
h, that is, if the terms (apart from constant coefficients) are of the
form $x^h, x^{h-1}y, x^{h-2}y^2, \ldots, y^h$. These homogeneous polynomials have
the property that the equation

$$f(tx, ty) = t^h f(x, y)$$

holds for every value of t. More generally, we say that a function
f(x, y, . . .) is homogeneous of degree h if it satisfies the equation

$$f(tx, ty, \ldots) = t^h f(x, y, \ldots).$$


Examples of homogeneous functions that are not polynomials are

$$\tan\left(\frac{y}{x}\right) \qquad (h = 0),$$

$$x^2 \sin\frac{x}{y} + y\sqrt{x^2 + y^2}\,\log\frac{x + y}{x} \qquad (h = 2).$$

Another example is the cosine of the angle between two vectors with
the respective components x, y, z and u, v, w:

$$\frac{xu + yv + zw}{\sqrt{x^2 + y^2 + z^2}\,\sqrt{u^2 + v^2 + w^2}} \qquad (h = 0).$$

The length of the vector with components x, y, z,

$$\sqrt{x^2 + y^2 + z^2},$$

is an example of a function that is positively homogeneous and of the
first degree; that is, the equation defining homogeneous functions
does not hold for this function unless t is positive or 0.


Homogeneous functions that are also differentiable satisfy Euler’s
partial differential equation

$$x f_x + y f_y + z f_z + \cdots = h f(x, y, z, \ldots).$$

To prove this we differentiate both sides of the equation
$f(tx, ty, \ldots) = t^h f(x, y, \ldots)$ with respect to t; this is permissible, since the
equation is an identity in t. Applying the chain rule to the function on the
left, we obtain

$$x f_x(tx, ty, \ldots) + y f_y(tx, ty, \ldots) + \cdots = h t^{h-1} f(x, y, \ldots).$$


If we substitute t = 1 in this, the statement follows. 
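
Euler's relation lends itself to a quick numerical check. The Python sketch below is an illustration under stated assumptions: the sample function f is chosen here because it is homogeneous of degree 2 for positive arguments, the evaluation point and the factor t are arbitrary, and the partial derivatives are approximated by central differences (the helper `partial` is hypothetical), so agreement holds only up to a small tolerance.

```python
import math

def f(x, y):
    """A sample function, homogeneous of degree h = 2 for x > 0, y > 0."""
    return (x**2 * math.sin(x / y)
            + y * math.sqrt(x**2 + y**2) * math.log((x + y) / x))

def partial(func, point, index, step=1e-6):
    """Central-difference approximation of a first partial derivative."""
    plus, minus = list(point), list(point)
    plus[index] += step
    minus[index] -= step
    return (func(*plus) - func(*minus)) / (2 * step)

h = 2
x, y = 1.3, 0.7
t = 2.5

# Homogeneity: f(tx, ty) = t^h f(x, y)
lhs_hom = f(t * x, t * y)
rhs_hom = t**h * f(x, y)

# Euler's relation: x f_x + y f_y = h f
lhs_euler = x * partial(f, (x, y), 0) + y * partial(f, (x, y), 1)
rhs_euler = h * f(x, y)

print(abs(lhs_hom - rhs_hom) < 1e-9)      # True
print(abs(lhs_euler - rhs_euler) < 1e-5)  # True
```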

Conversely, it is easy to show that the homogeneity of the function
f(x, y, . . .) is a consequence of Euler’s relation, so that Euler’s relation
is a necessary and sufficient condition for the homogeneity of the
function. The fact that a function is homogeneous of degree h can also be
expressed by saying that the value of the function divided by $x^h$
depends only on the ratios y/x, z/x, . . . . It is therefore sufficient to show
that it follows from the Euler relation that if new variables

$$\xi = x, \qquad \eta = \frac{y}{x}, \qquad \zeta = \frac{z}{x}, \qquad \ldots$$

are introduced, the function

$$\frac{1}{x^h} f(x, y, z, \ldots) = \frac{1}{\xi^h} f(\xi, \eta\xi, \zeta\xi, \ldots) = g(\xi, \eta, \zeta, \ldots)$$

no longer depends on the variable ξ (i.e., that the equation $g_\xi = 0$ is
an identity). In order to prove this, we use the chain rule:

$$g_\xi = (f_x + \eta f_y + \cdots)\frac{1}{\xi^h} - \frac{h}{\xi^{h+1}} f
= (x f_x + y f_y + \cdots)\frac{1}{\xi^{h+1}} - \frac{h}{\xi^{h+1}} f.$$

The expression on the right vanishes in virtue of Euler’s relation, and
our statement is proved.

This last statement can also be proved in a more elegant, but less
direct, way. We wish to show that from Euler’s relation it follows that
the function

$$g(t) = t^h f(x, y, \ldots) - f(tx, ty, \ldots)$$

has the value 0 for all values of t. It is obvious that g(1) = 0. Again,

$$g'(t) = h t^{h-1} f(x, y, \ldots) - x f_x(tx, ty, \ldots) - y f_y(tx, ty, \ldots) - \cdots.$$

On applying Euler’s relation to the arguments tx, ty, . . . we find that

$$x f_x(tx, ty, \ldots) + y f_y(tx, ty, \ldots) + \cdots = \frac{h}{t} f(tx, ty, \ldots),$$

and thus g(t) satisfies the differential equation

$$g'(t) = \frac{h}{t}\, g(t).$$

If we write $g(t) = \gamma(t)\, t^h$, we obtain $g'(t) = \frac{h}{t} g(t) + t^h \gamma'(t)$, so that γ(t)
satisfies the differential equation

$$t^h \gamma'(t) = 0,$$

which has the unique solution γ = constant = c. Since for t = 1 it is
obvious that γ(t) = 0, the constant c is 0, and so g(t) = 0 for all values
of t, as was to be proved.


CHAPTER 
2 


Vectors, Matrices, 
Linear Transformations 


Vectors in two dimensions have already been studied in Volume I, 
Chapter 4. Geometric concepts in higher dimensions make the use of 
vectors even more essential. Vectors serve to express many com- 
plicated equations concisely in a manner clearly exhibiting those fea- 
tures that do not depend on a particular choice of coordinate systems. 


2.1 Operations with Vectors 


a. Definition of Vectors 


We introduce vectors in n-dimensional space as entities that can be
added to each other and multiplied by scalars. Specifically, a vector
A is a set of n real numbers¹ $a_1, \ldots, a_n$ in a definite order

$$A = (a_1, \ldots, a_n).$$

(We always employ boldface type to denote vectors.) The numbers
$a_1, \ldots, a_n$ are called the components of A. Two vectors
$A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$ are equal if and only if they have the same
components.

The sum of any two vectors $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$
is defined by

$$A + B = (a_1 + b_1,\ a_2 + b_2,\ \ldots,\ a_n + b_n); \tag{1a}$$


¹ For our purposes it is sufficient to consider only real numbers as components,
although vectors over other number fields also are used in other contexts.

we define the product of the vector $A = (a_1, \ldots, a_n)$ by the scalar
(i.e., real number) λ as

$$\lambda A = (\lambda a_1, \lambda a_2, \ldots, \lambda a_n). \tag{1b}$$ ¹

More generally, we can form from any finite number of vectors
$A = (a_1, a_2, \ldots, a_n)$, $B = (b_1, b_2, \ldots, b_n)$, . . . , $D = (d_1, d_2, \ldots, d_n)$
and an equal number of scalars λ, μ, . . . , ν the linear combination

$$\lambda A + \mu B + \cdots + \nu D = (\lambda a_1 + \mu b_1 + \cdots + \nu d_1,\ \ldots,\ \lambda a_n + \mu b_n + \cdots + \nu d_n).$$

In particular, any vector $A = (a_1, \ldots, a_n)$ can be
represented as a linear combination of the n “coordinate vectors”

$$E_1 = (1, 0, 0, \ldots, 0), \quad E_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad E_n = (0, 0, 0, \ldots, 1). \tag{2a}$$

Obviously,

$$A = a_1 E_1 + a_2 E_2 + \cdots + a_n E_n. \tag{2b}$$

We use the symbol 0 for the “zero vector,” all of whose components
vanish: 0 = (0, 0, . . . , 0). We write −A for the vector
$(-1)A = (-a_1, -a_2, \ldots, -a_n)$.


It follows trivially from these definitions that sums of vectors and
products with scalars obey all the usual algebraic laws, as far as they
are meaningful.² Examples of objects conveniently represented by
vectors are furnished by functions that are linear combinations of a
finite number of suitably chosen functions. Thus, the general
polynomial of degree ≤ n in the variable x


¹ Vectors differ from other objects that can be described by an ordered set of n real
numbers (e.g., points in n-dimensional euclidean space or on a sphere in n + 1
dimensions) just by the fact that they permit the “linear operations” A + B and λA.
Addition of points defined similarly in terms of their coordinates would have no
geometric meaning, at least no meaning independent of the special coordinate
system used. Vectors will be represented later by pairs of points (see p. 109).

² These laws are the following:

(1) A + B = B + A, A + (B + C) = (A + B) + C

(2) λ(A + B) = λA + λB, (λ + μ)A = λA + μA, (λμ)A = λ(μA)

(3) There exists a unique element 0 such that A + 0 = A for all A

(4) There exists a unique element −A for given A such that A + (−A) = 0

(5) 0A = 0, 1A = A for all A.

Generally, sets of objects for which addition of the objects and multiplication by
scalars are defined, and obey these laws, are called vector spaces.

$$P(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$

can be represented by the single vector $A = (a_0, a_1, \ldots, a_n)$ in
(n + 1)-dimensional space. Addition of vectors and multiplication by
scalars correspond then to the same operations carried out for the
polynomials. Similarly, the general nth degree trigonometric
polynomial

$$f(x) = \tfrac12 a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)$$

(see Volume I, p. 577) can be represented by the vector
$(a_0, a_1, \ldots, a_n, b_1, b_2, \ldots, b_n)$ in (2n + 1)-dimensional space. The general linear
homogeneous function of three variables

$$u = a_1 x_1 + a_2 x_2 + a_3 x_3$$

is represented by the vector $(a_1, a_2, a_3)$ in three-dimensional space,
and the general quadratic form in three variables

$$u = a_1 x_1^2 + a_2 x_2^2 + a_3 x_3^2 + 2a_4 x_2 x_3 + 2a_5 x_3 x_1 + 2a_6 x_1 x_2$$

by the vector $(a_1, a_2, a_3, a_4, a_5, a_6)$ in six-dimensional space.
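
The correspondence between polynomials and coefficient vectors can be made concrete in a few lines of Python. The sketch is an illustration only; the helper names `add`, `scale`, and `evaluate` and the two sample polynomials are assumptions introduced for this example.

```python
def add(a, b):
    """Componentwise sum of two vectors of equal length, as in (1a)."""
    return [ai + bi for ai, bi in zip(a, b)]

def scale(lam, a):
    """Product of the vector a with the scalar lam, as in (1b)."""
    return [lam * ai for ai in a]

def evaluate(coeffs, x):
    """Value of a_0 + a_1 x + ... + a_n x^n for coeffs = (a_0, ..., a_n)."""
    return sum(c * x**k for k, c in enumerate(coeffs))

# Two polynomials of degree <= 2, stored as vectors in 3-dimensional space.
A = [1, 0, 2]     # 1 + 2x^2
B = [3, -1, 0]    # 3 - x

x = 2
# Adding coefficient vectors corresponds to adding the polynomials.
assert evaluate(add(A, B), x) == evaluate(A, x) + evaluate(B, x)
# Scaling the vector corresponds to scaling the polynomial.
assert evaluate(scale(4, A), x) == 4 * evaluate(A, x)
print(add(A, B), scale(4, A))   # [4, -1, 2] [4, 0, 8]
```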


b. Geometric Representation of Vectors 


Vectors in n-dimensional space, just as in the plane, can be
visualized geometrically as certain mappings of space, the translations or
parallel displacements. The vector $A = (a_1, a_2, \ldots, a_n)$ may be
depicted as the translation of n-dimensional euclidean space $R^n$ that
maps any point $P = (x_1, x_2, \ldots, x_n)$ into the point
$P' = (x_1', x_2', \ldots, x_n')$ with coordinates

$$x_1' = x_1 + a_1, \quad x_2' = x_2 + a_2, \quad \ldots, \quad x_n' = x_n + a_n. \tag{3a}$$

The translation or the corresponding vector A is determined
uniquely if for a single point $P = (x_1, x_2, \ldots, x_n)$ we give the image
$P' = (x_1', x_2', \ldots, x_n')$; obviously by (3a)

$$A = (x_1' - x_1,\ x_2' - x_2,\ \ldots,\ x_n' - x_n). \tag{3b}$$

¹ It is understood that both points P and P′ lie in $R^n$ and that their coordinates are
taken with respect to the same coordinate system.

We shall denote this translation by $A = \overrightarrow{PP'}$ and say that the vector
A is represented by the ordered pair of points P and P′. We call P the
initial point and P′ the end point or final point in this representation.

In drawings the vector $A = \overrightarrow{PP'}$ usually is indicated by an arrow
extending from P to P′. The same vector A has many representations
$A = \overrightarrow{PP'}$ by a pair of points P and P′. The initial point P is completely
arbitrary, since the mapping defined by A can act on any point and
then determine an image P′.¹ The zero vector 0 corresponds to the
“identity mapping” in which each point is mapped onto itself:
$0 = \overrightarrow{PP}$.

As in the planar case (Volume I, p. 384) the sum of two vectors
$A = (a_1, \ldots, a_n)$, $B = (b_1, \ldots, b_n)$ yields the symbolic product
(that is, the composition) of the corresponding mappings. If A takes the
point $P = (x_1, \ldots, x_n)$ into the point $P' = (x_1', \ldots, x_n')$ and B takes
the point P′ into $P'' = (x_1'', \ldots, x_n'')$, then C = A + B corresponds to
the translation that takes P into P″, since

$$x_i'' = x_i' + b_i = (x_i + a_i) + b_i = x_i + (a_i + b_i)$$

for i = 1, . . . , n. In vector notation we have

$$A + B = \overrightarrow{PP'} + \overrightarrow{P'P''} = \overrightarrow{PP''}. \tag{4}$$

If we represent B in the form $\overrightarrow{PP'''}$, giving it the same initial point
P as A, we find that $A + B = \overrightarrow{PP''}$ is represented by the diagonal of
the parallelogram with vertices P, P′, P″, P‴ (see Fig. 2.1).


Figure 2.1 Addition of vectors.

¹ Occasionally the notation P′ − P is used for the vector $\overrightarrow{PP'}$, which, in accordance
with formula (3b), suggests the notion of vectors as differences of points.

Interchanging initial and end point of the vector
$A = \overrightarrow{PP'} = (x_1' - x_1,\ x_2' - x_2,\ \ldots,\ x_n' - x_n)$ leads to the opposite vector

$$\overrightarrow{P'P} = (x_1 - x_1',\ x_2 - x_2',\ \ldots,\ x_n - x_n') = (-1)A = -A.$$

The mapping P′ → P corresponding to −A is the inverse to the mapping
A; carrying out first A and then −A results in the identity mapping in
accordance with the formula

$$(-A) + A = (-1 + 1)A = 0A = 0.$$

Corresponding to (4) we have the often used formula for the difference
of two vectors $A = \overrightarrow{PP'}$ and $B = \overrightarrow{PP''}$ with common initial point:

$$B - A = \overrightarrow{PP''} - \overrightarrow{PP'} = \overrightarrow{PP''} + \overrightarrow{P'P} = \overrightarrow{P'P} + \overrightarrow{PP''} = \overrightarrow{P'P''}. \tag{4a}$$

The difference of the vectors $\overrightarrow{PP''}$ and $\overrightarrow{PP'}$ is here represented by the
third side of the triangle with vertices P, P′, P″.

We can associate with every point $P = (x_1, \ldots, x_n)$ a unique
vector that has the origin as initial point and P as end point; this is
the vector

$$\overrightarrow{OP} = (x_1, \ldots, x_n),$$

the so-called position vector of P. The components of the position
vector of P are just the coordinates of P. For example, the coordinate
vector $E_i = (0, \ldots, 0, 1, 0, \ldots, 0)$ in formula (2a) is the position
vector of the point on the positive $x_i$-axis that has distance 1 from the
origin. Any vector $A = \overrightarrow{PP'}$ can always be written as the difference of
the position vectors of its end point and initial point:

$$\overrightarrow{PP'} = \overrightarrow{OP'} - \overrightarrow{OP} \tag{5}$$

(see Fig. 2.2).


Figure 2.2 The vector $\overrightarrow{PP'}$ as difference of position vectors.
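
The relation between points, translations, and position vectors can be sketched in Python, with points and vectors both stored as tuples of coordinates. This is an illustration only; the helper names `vector` and `translate` and the sample coordinates are assumptions made for the example.

```python
def vector(p, p_prime):
    """Vector from initial point p to end point p_prime, as in (3b)."""
    return tuple(b - a for a, b in zip(p, p_prime))

def translate(p, a):
    """Image of the point p under the translation given by the vector a, as in (3a)."""
    return tuple(x + ai for x, ai in zip(p, a))

O = (0, 0, 0)
P = (1, 2, 3)
A = (4, -1, 0)
P_prime = translate(P, A)

# The same vector is obtained from any initial point.
Q = (-2, 0, 7)
assert vector(P, P_prime) == vector(Q, translate(Q, A)) == A

# Formula (5): the vector PP' is the difference of the position vectors OP' and OP.
OP, OP_prime = vector(O, P), vector(O, P_prime)
assert vector(P, P_prime) == tuple(b - a for a, b in zip(OP, OP_prime))
print(P_prime)   # (5, 1, 3)
```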


c. Length of Vectors, Angles Between Directions 


The distance between two points $P = (x_1, \ldots, x_n)$ and
$P' = (x_1', \ldots, x_n')$ in n-dimensional euclidean space $R^n$ is given by the
formula¹

$$r = \sqrt{(x_1' - x_1)^2 + (x_2' - x_2)^2 + \cdots + (x_n' - x_n)^2}. \tag{6}$$

Since only the differences of corresponding coordinates of P, P′ enter
into the expression for r, we see that the distance is the same for all
pairs of points P, P′ that represent the same vector $A = \overrightarrow{PP'}$. We call
r the length of the vector A and write r = |A|. The vector
$A = (a_1, \ldots, a_n)$ has the length

$$|A| = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}. \tag{6a}$$

The zero vector 0 = (0, 0, . . . , 0) has length 0. The length of any
other vector is a positive number.

In euclidean geometry, angles can be expressed in terms of lengths.
This is achieved by the trigonometric formula (“law of cosines”) that
gives in a triangle with sides a, b, c the angle γ between the sides a
and b:

$$\cos\gamma = \frac{a^2 + b^2 - c^2}{2ab}. \tag{6b}$$

We apply this formula to a triangle with vertices P, P′, P″ (Fig. 2.3a).
The sides a and b of the triangle are the lengths of the vectors
$A = \overrightarrow{PP'}$, $B = \overrightarrow{PP''}$, while side c is the length of the vector

$$C = \overrightarrow{P'P''} = \overrightarrow{PP''} - \overrightarrow{PP'} = B - A.$$

¹ In two or three dimensions the formula can be derived geometrically by applying
the theorem of Pythagoras. In higher dimensions the expression for r can be
considered as the definition of distance between two points in n-dimensional euclidean
space, when referred to a Cartesian coordinate system.

Figure 2.3 Vector representation of a line through a given point with
a given direction.

For

$$A = (a_1, \ldots, a_n), \qquad B = (b_1, \ldots, b_n)$$

we have

$$C = (c_1, \ldots, c_n) = (b_1 - a_1, \ldots, b_n - a_n).$$

By (6b)

$$\cos\gamma = \frac{|A|^2 + |B|^2 - |C|^2}{2|A|\,|B|},$$

where

$$|A|^2 = \sum_{i=1}^{n} a_i^2, \qquad |B|^2 = \sum_{i=1}^{n} b_i^2, \qquad |C|^2 = \sum_{i=1}^{n} (b_i - a_i)^2.$$

Since $|A|^2 + |B|^2 - |C|^2 = 2\sum_{i=1}^{n} a_i b_i$, we obtain for A ≠ 0, B ≠ 0,

$$\cos\gamma = \frac{a_1 b_1 + a_2 b_2 + \cdots + a_n b_n}{\sqrt{a_1^2 + \cdots + a_n^2}\,\sqrt{b_1^2 + \cdots + b_n^2}}. \tag{7}$$


We see that the angle γ in the triangle PP′P″ depends only on the
vectors $A = \overrightarrow{PP'}$ and $B = \overrightarrow{PP''}$. Accordingly, we call the quantity cos γ
given by formula (7) the cosine of the angle¹ between the vectors
$A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$.

Formula (7) for cos γ actually always defines real angles γ
between any two nonzero vectors A, B, since it always yields a value
with |cos γ| ≤ 1. This is an immediate consequence of the
Cauchy-Schwarz inequality (Volume I, p. 15)

$$(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2 \le (a_1^2 + a_2^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + \cdots + b_n^2). \tag{8}$$


In computing the angles between the vector A and any other
vector B from (7), we need to know only the quantities

$$\xi_i = \frac{a_i}{\sqrt{a_1^2 + \cdots + a_n^2}}, \qquad i = 1, \ldots, n, \tag{9}$$

which are called the direction cosines of A. All nonzero vectors
with the same direction cosines form the same angles with other
vectors and thus can be said to have the same direction. It follows
from (7) that the direction cosines of A can be interpreted as cosines
of certain angles:

$$\xi_i = \cos\alpha_i, \tag{10}$$

where $\alpha_i$ is the angle between A and the ith “coordinate vector”
$E_i = (0, \ldots, 0, 1, 0, \ldots, 0)$. The n direction cosines of the vector
A satisfy the identity²

$$\cos^2\alpha_1 + \cos^2\alpha_2 + \cdots + \cos^2\alpha_n = 1. \tag{11}$$


The only vector without direction cosines (and thus without a direction) 
is the zero vector. 
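
Formulas (6a), (7), (9), and (11) translate directly into a short computation. The Python sketch below is an illustration only; the helper names and the two sample vectors are assumptions, and because floating-point arithmetic is used the identities hold only up to rounding.

```python
import math

def dot(a, b):
    """Numerator of formula (7): a_1 b_1 + ... + a_n b_n."""
    return sum(ai * bi for ai, bi in zip(a, b))

def length(a):
    """Length |A| from formula (6a)."""
    return math.sqrt(dot(a, a))

def cos_angle(a, b):
    """cos(gamma) from formula (7); both vectors must be nonzero."""
    return dot(a, b) / (length(a) * length(b))

def direction_cosines(a):
    """The quantities xi_i = a_i / |A| from formula (9); a must be nonzero."""
    norm = length(a)
    return [ai / norm for ai in a]

A = [1.0, 2.0, 2.0]    # length 3
B = [3.0, 0.0, 4.0]    # length 5

print(cos_angle(A, B))              # 11/15, approximately 0.7333
xi = direction_cosines(A)           # approximately [1/3, 2/3, 2/3]
print(sum(c * c for c in xi))       # close to 1, as identity (11) requires

# Each direction cosine is the cosine of the angle with a coordinate vector.
E1 = [1.0, 0.0, 0.0]
print(math.isclose(xi[0], cos_angle(A, E1)))   # True
```

The zero vector must be excluded because both (7) and (9) would then divide by a length of 0.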

Two vectors A and B not equal to 0 have the same direction if and
only if they have the same direction cosines, that is, if

$$\frac{a_i}{|A|} = \frac{b_i}{|B|} \qquad (i = 1, \ldots, n).$$

¹ The angle γ itself is determined uniquely only if we confine γ to lie in the interval
0 ≤ γ ≤ π. Replacing γ by 2nπ ± γ (where n is an integer), we obtain all other
angles with the same value of cos γ, and any of these will be considered as an angle
between A and B.

² In two dimensions the relation $\cos^2\alpha_1 + \cos^2\alpha_2 = 1$ permits us to choose for $\alpha_2$
the value $\pi/2 - \alpha_1$. In three or higher dimensions the relation (11) between the
direction cosines does not correspond to any simple linear relation between the
angles $\alpha_i$ themselves.

Clearly, this is the case if and only if A and B satisfy a relation
A = λB, where λ is positive. Here λ = |A|/|B| is the ratio of the lengths
of the vectors. A vector of length 1 is called a unit vector. The vector

$$(\xi_1, \ldots, \xi_n) = \frac{1}{|A|}\, A$$

whose components are the direction cosines of A is the unit vector in
the direction of A.

The vector $-A = (-a_1, \ldots, -a_n)$ opposite to A has the direction
cosines $-\xi_i$. We call its direction opposite to that of A. Two vectors
A and B neither of which is the zero vector will be called parallel if
they either have the same or the opposite directions. It is necessary
and sufficient for parallelism that A = λB for some number λ ≠ 0. The
components $a_1, \ldots, a_n$ of any vector A ≠ 0 parallel to a given
direction are called direction numbers for that direction.

If we assign to a unit vector $(\xi_1, \ldots, \xi_n)$ the origin O as initial
point, the end point $P = (\xi_1, \ldots, \xi_n)$ is a point on the “unit
sphere” (i.e., the sphere of radius 1 and center at the origin O)
$\xi_1^2 + \xi_2^2 + \cdots + \xi_n^2 = 1$. Since there exists exactly one unit vector in
any given direction, we see that the different directions in
n-dimensional space can be represented by the points of the unit sphere.
The points on the sphere corresponding to opposite directions are
diametrically opposite.

Intuitively a straight line can be thought of as a curve of “constant 
direction”. This suggests that a straight line in n-dimensional space 
be defined as a locus of points with the property that all vectors 4 0 
with initial and end point on the line are parallel. This definition leads 
immediately to a vector representation for lines. For any distinct 


points P, Q on the line L the vector PQ is parallel to a fixed vector A, 
that is, 


PQ = 1A (. £ 0). 


If we keep P and A fixed and let Q run through all points of the line 
L we have for the position vector of Q the formula (see Fig. 2.3b) 


(12) OÒ = OP + PQ = OP + 1A. 
Here the parameter $\lambda$ varies over all real values; the value $\lambda = 0$
corresponds to the point Q = P. If Q has coordinates $x_1, \ldots, x_n$;
P, the coordinates $y_1, \ldots, y_n$; and A, the components $a_1, \ldots, a_n$,
formula (12) corresponds to the parametric representation of the line


$x_i = y_i + \lambda a_i \quad (i = 1, \ldots, n)$


where the parameter $\lambda$ varies over all real numbers. The point P divides
the line L into two half-lines, or “rays,” distinguished by the sign
of $\lambda$. For $\lambda > 0$ the vector $\overrightarrow{PQ}$ has the same direction as A (“points”
in the direction of A); for $\lambda < 0$ the vector $\overrightarrow{PQ}$ points in the opposite
direction.
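
The parametric representation of a line can be tried out numerically. The following is a minimal sketch, not part of the original text: the function name `point_on_line` and the sample point and direction are illustrative assumptions.

```python
def point_on_line(p, a, lam):
    """Return the point Q with coordinates x_i = y_i + lam * a_i.

    p   -- coordinates (y_1, ..., y_n) of the fixed point P
    a   -- components (a_1, ..., a_n) of the direction vector A (A != 0)
    lam -- the real parameter lambda
    """
    return tuple(y + lam * ai for y, ai in zip(p, a))


# Hypothetical sample data, chosen only for illustration.
P = (1.0, 2.0, 0.0)
A = (2.0, -1.0, 2.0)

print(point_on_line(P, A, 0.0))   # lambda = 0 reproduces P itself
print(point_on_line(P, A, 1.5))   # lambda > 0: the ray pointing in the direction of A
print(point_on_line(P, A, -1.0))  # lambda < 0: the opposite ray
```

The sign of `lam` alone decides on which of the two rays through P the computed point lies.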


d. Scalar Products of Vectors 


The quantity appearing in the numerator of formula (7) for the
angle $\gamma$ between two vectors $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$
is called the scalar product of A and B and denoted by $A \cdot B$:


(13) $A \cdot B = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n.$

Expressed in terms of geometric entities it can be written as

(14) $A \cdot B = |A|\,|B| \cos\gamma.$


The scalar product of two vectors is the product of their lengths
multiplied by the cosine of the angle between their directions. If
$A = \overrightarrow{PP'}$, $B = \overrightarrow{PP''}$, we can interpret $p = |A| \cos\gamma$ geometrically as
the (signed) projection of the segment PP' onto the line PP'' (see Fig.
2.4). We call p the component of the vector A in the direction of B. By
formula (14) we have


(14a) $A \cdot B = p\,|B|.$


Thus the scalar product of the vectors A, B is equal to the component
of A in the direction of B multiplied by the length of B.¹ If B is the
coordinate vector $E_i = (0, \ldots, 1, \ldots, 0)$ in the direction of the
positive $x_i$-axis, the component of A in the direction of B is simply
$a_i$, the ith component of the vector A. One easily verifies from the


¹ It is, of course, also equal to the component of B in the direction of A multiplied by
the length of A.

Figure 2.4 Scalar product of the vectors $A = \overrightarrow{PP'}$ and $B = \overrightarrow{PP''}$.


definition (13) that the scalar product satisfies the usual algebraic 
laws 


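(15a) $A \cdot B = B \cdot A$ (commutative law)

(15b) $\lambda(A \cdot B) = (\lambda A) \cdot B = A \cdot (\lambda B)$ (associative law)¹

(15c) $A \cdot (B + C) = A \cdot B + A \cdot C, \quad (A + B) \cdot C = A \cdot C + B \cdot C$ (distributive laws).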


The fundamental importance of the scalar product stems from the 
fact that, expressed in terms of the components of the vectors A and 
B, it has the simple algebraic expression (13), while at the same time 
it has a purely geometric interpretation represented by formula (14), 
which makes no mention of the components of the vectors in any 
specific coordinate system. Scalar products are not only useful in 
describing angles but form the basis for deriving analytic expressions 
for areas and volumes as well. 

We conclude from the Cauchy-Schwarz inequality (8) that the 
scalar product satisfies the inequality 


(16) $|A \cdot B| \leq |A|\,|B|,$


which just expresses that $|\cos\gamma| \leq 1$. We shall see (p. 191) that the


¹ Since the scalar product of two vectors is not a vector but a scalar, there is no
associative law involving scalar products of three vectors.

equality in (16) holds only if the vectors A and B are parallel or if at
least one of them is the zero vector.
We notice that by (6a), (13) for B = A


(17a) $A \cdot A = |A|^2.$


That is, the scalar product of a vector with itself is the square of its
length. This also follows from (14), since the vector A forms the
angle $\gamma = 0$ with itself. The important relation


(17b) $A \cdot B = 0$


for nonzero vectors A, B corresponds to $\cos\gamma = 0$ or $\gamma = \pi/2$. It
characterizes the vectors A, B as “perpendicular” or “orthogonal”
or “normal” to each other. On the other hand, $A \cdot B > 0$ means
$\cos\gamma > 0$; that is, we can assign to $\gamma$ a value with $0 \leq \gamma < \pi/2$; the
directions of the vectors form an acute angle. Similarly, $A \cdot B < 0$
means that the vectors form an angle with $\pi/2 < \gamma \leq \pi$, an obtuse
angle, with each other.
For example, the two coordinate vectors (see p. 123) 


$E_1 = (1, 0, 0, \ldots, 0)$ and $E_2 = (0, 1, 0, \ldots, 0)$

are orthogonal to each other, since

$E_1 \cdot E_2 = 1 \cdot 0 + 0 \cdot 1 + 0 \cdot 0 + \cdots + 0 \cdot 0 = 0.$ More generally, any
two distinct coordinate vectors $E_i$ and $E_k$ are orthogonal:
(17c) $E_i \cdot E_k = 0 \quad (i \neq k).$
For k = i, we have, of course,
(17d) $E_i \cdot E_i = |E_i|^2 = 1;$
the coordinate vectors have length 1.
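
The algebraic expression (13) and the geometric interpretation (14) of the scalar product can be compared on concrete vectors. The sketch below is an illustration only; the helper names `dot`, `length`, `angle`, and `classify` are assumptions introduced here, not notation from the text.

```python
import math


def dot(a, b):
    """Scalar product a_1 b_1 + ... + a_n b_n, as in formula (13)."""
    return sum(x * y for x, y in zip(a, b))


def length(a):
    """Length |A| = sqrt(A . A), as in formula (17a)."""
    return math.sqrt(dot(a, a))


def angle(a, b):
    """Angle gamma between two nonzero vectors, solved from formula (14)."""
    c = dot(a, b) / (length(a) * length(b))
    # Clamp against rounding so that acos never sees a value outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, c)))


def classify(a, b):
    """Classify the angle between nonzero vectors by the sign of A . B."""
    s = dot(a, b)
    if s == 0:
        return "orthogonal"
    return "acute" if s > 0 else "obtuse"


E1, E2 = (1, 0, 0), (0, 1, 0)
print(classify(E1, E2))            # the coordinate vectors are orthogonal
print(classify((1, 2), (3, 1)))    # positive scalar product
print(classify((1, 0), (-1, 1)))   # negative scalar product
```

Only the sign of the scalar product is needed to decide between acute, right, and obtuse angles; the angle itself requires the lengths as well.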


e. Equation of Hyperplanes in Vector Form 


The locus of the points $P = (x_1, \ldots, x_n)$ in n-dimensional space
$R^n$ satisfying a linear equation of the form


(18) $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = c$


(where $a_1, a_2, \ldots, a_n$ do not all vanish) is called a hyperplane. The
prefix “hyper-” is needed because n-dimensional space contains
“planes,” or “linear manifolds,” of various dimensions; the hyperplanes
can be identified with the (n - 1)-dimensional euclidean spaces
contained in the n-dimensional space $R^n$. They are the ordinary
two-dimensional planes in three-dimensional space, the straight lines in
the plane, the points on a line.

Introducing the vector $A = (a_1, a_2, \ldots, a_n)$ and the position
vector $X = (x_1, \ldots, x_n) = \overrightarrow{OP}$ of the point P, we can write equation
(18) in vector notation as


(18a) $A \cdot X = c \quad (A \neq 0).$


Let $Y = (y_1, \ldots, y_n) = \overrightarrow{OQ}$ be the position vector of a particular
point Q of the hyperplane, so that $A \cdot Y = c$. Subtracting this
equation from (18a), we find that the points P of the hyperplane
satisfy


(19) $0 = A \cdot X - A \cdot Y = A \cdot (X - Y) = A \cdot \overrightarrow{QP}.$


Hence the vector A is perpendicular to the line joining any two 
points of the hyperplane. The hyperplane consists of those points 
obtained by proceeding from any one of its points Q in all directions 
perpendicular to A. We call the direction of A “normal” to the 
hyperplane (see Fig. 2.5). 


Figure 2.5 Law of formation of third-order determinant. 
The hyperplane with equation (18a) divides space into the two
open half-spaces given by $A \cdot X < c$ and $A \cdot X > c$. The vector A
points into the half-space $A \cdot X > c$. By this we mean that a ray from
a point Q of the hyperplane in the direction of A consists of points
whose position vectors X satisfy $A \cdot X > c$. Indeed the position
vectors X of points P of such a ray are given by


$X = \overrightarrow{OP} = \overrightarrow{OQ} + \lambda A = Y + \lambda A$


[see (12)], where Y is the position vector of Q and $\lambda$ is a positive
number. Then obviously


$A \cdot X = A \cdot Y + A \cdot \lambda A = c + \lambda |A|^2 > c.$


More generally, any vector B forming an acute angle with A points
into the half-space $A \cdot X > c$, since $A \cdot B > 0$ implies that


$A \cdot X = A \cdot (Y + \lambda B) = A \cdot Y + \lambda A \cdot B > c.$


If the constant c is positive, the half-space $A \cdot X < c$ will be the one
containing the origin, since $A \cdot O = 0 < c$. Then A has the normal
direction “away from the origin”.

The linear equation (18a) describing a given hyperplane is not
unique, for we can multiply the equation by an arbitrary constant
factor $\lambda \neq 0$, which amounts to replacing the vector A by the parallel
vector $\lambda A$ and the constant c by $\lambda c$. If $c \neq 0$ (that is, if the
hyperplane does not pass through the origin) we can choose


$\lambda = \frac{\operatorname{sgn} c}{|A|}.$


Multiplying (18a) by $\lambda$, we obtain the normal form of the equation
of the hyperplane


(20) $B \cdot X = p.$


Here p is a positive constant, and B is the unit normal vector pointing
away from the origin. The constant p in equation (20) is simply the
distance of the hyperplane from the origin O, that is, the shortest
distance of any point of the hyperplane from O. For let P be any point
of the hyperplane and let X be the position vector of P. Then the
distance of P from the origin O is given by


$|\overrightarrow{OP}| = |X| = |X|\,|B|.$


It follows from (16), (20) that


$|\overrightarrow{OP}| \geq B \cdot X = p.$


Equality holds for the special point P of the hyperplane with position
vector


$\overrightarrow{OP} = X = pB.$


The line joining this point to the origin has the direction of the
normal to the hyperplane. More generally we can find the distance
d of any point Q in space with position vector Y from the hyperplane.
As the reader may verify,


(20a) $d = |B \cdot Y - p|.$
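
The passage from a general equation $A \cdot X = c$ to the normal form (20), and the distance formula (20a), can be sketched in a few lines. This is an illustration under stated assumptions: the scaling factor is taken to be the sign of c divided by |A| (so that p comes out positive and B has length 1), and the function names and sample plane are hypothetical.

```python
import math


def dot(a, b):
    """Scalar product of two vectors given as sequences of components."""
    return sum(x * y for x, y in zip(a, b))


def normal_form(a, c):
    """Return (B, p) with |B| = 1 and p > 0 such that B . X = p
    describes the same hyperplane as A . X = c. Requires c != 0."""
    if c == 0:
        raise ValueError("the hyperplane passes through the origin")
    lam = math.copysign(1.0, c) / math.sqrt(dot(a, a))
    return tuple(lam * ai for ai in a), lam * c


def distance(b, p, y):
    """Distance of the point with position vector Y from B . X = p, formula (20a)."""
    return abs(dot(b, y) - p)


# Hypothetical plane 2 x_1 - x_2 + 2 x_3 = 6 in three-dimensional space.
B, p = normal_form((2.0, -1.0, 2.0), 6.0)
print(B, p)                               # unit normal and distance from the origin
print(distance(B, p, (1.0, 1.0, 1.0)))    # distance of the point (1, 1, 1)
```

For this sample plane |A| = 3, so the distance from the origin is p = 2, and the point (1, 1, 1) lies at distance 1 from the plane, up to rounding.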


f. Linear Dependence of Vectors and Systems of Linear Equations 


Many problems in mathematical analysis can be reduced to the
study of linear relations between a number of vectors in n-dimensional
space. A vector Y is called dependent¹ on the vectors $A_1, A_2, \ldots, A_m$
if Y can be represented as a “linear combination” of $A_1, \ldots, A_m$,
that is, if there exist scalars $x_1, \ldots, x_m$ such that

(21) $Y = x_1 A_1 + x_2 A_2 + \cdots + x_m A_m.$


Here m is any natural number. The zero vector is always dependent,
since it can be represented in the form (21) by choosing the value 0
for all the scalars $x_i$. Dependence of Y on a single vector $A_1 \neq 0$
means that either Y = 0 or Y is parallel to $A_1$. Choosing for
$A_1, \ldots, A_m$ the n coordinate vectors

(22) $E_1 = (1, 0, \ldots, 0), \quad E_2 = (0, 1, \ldots, 0), \quad \ldots, \quad E_n = (0, 0, \ldots, 1)$

we see that the relation (21) holds for any vector $Y = (y_1, \ldots, y_n)$
if we choose $x_1 = y_1, x_2 = y_2, \ldots, x_n = y_n$:

(23) $Y = y_1 E_1 + y_2 E_2 + \cdots + y_n E_n.$


¹ What we call here “dependent” is often called “linearly dependent” in the
literature. Since we do not consider any other kind of dependence between vectors, we
drop the word “linear.”

Thus, every vector in space is dependent on the coordinate vectors.

On the other hand, none of the n coordinate vectors $E_i$ is dependent
on any of the others, as is easily seen. More generally, a vector $Y \neq 0$
cannot be dependent on vectors $A_1, A_2, \ldots, A_m$ if Y is orthogonal to
each of the vectors $A_1, \ldots, A_m$. For multiplying relation (21) scalarly
by Y yields


$|Y|^2 = Y \cdot Y = Y \cdot (x_1 A_1 + x_2 A_2 + \cdots + x_m A_m)$
$\qquad = x_1 Y \cdot A_1 + x_2 Y \cdot A_2 + \cdots + x_m Y \cdot A_m = 0,$


and hence that Y = 0.


We call the vectors $A_1, \ldots, A_m$ dependent if there exist scalars
$x_1, x_2, \ldots, x_m$ that do not all vanish, such that
(24) $x_1 A_1 + x_2 A_2 + \cdots + x_m A_m = 0.$
If $A_1, \ldots, A_m$ are not dependent (that is, if (24) holds only for
$x_1 = x_2 = \cdots = x_m = 0$), we call $A_1, \ldots, A_m$ independent. For
example, the coordinate vectors $E_1, \ldots, E_n$ are independent, since
$0 = x_1 E_1 + x_2 E_2 + \cdots + x_n E_n = (x_1, x_2, \ldots, x_n)$


obviously implies that $x_1 = x_2 = \cdots = x_n = 0$.

The two notions of “dependence of a vector on a set of vectors” 
and “dependence of a set of vectors” are closely related. A number 
of vectors are dependent if and only if we can find one of them that 
is dependent on the others. For, obviously, relation (21) expressing
that Y is dependent on $A_1, \ldots, A_m$ can be written in the form


$x_1 A_1 + \cdots + x_m A_m + (-1)Y = 0,$


which shows that the m + 1 vectors $A_1, A_2, \ldots, A_m, Y$ are dependent.
Conversely, if $A_1, \ldots, A_m$ are dependent, we have a relation
of the form (24) where not all coefficients $x_i$ vanish. If, say, $x_k$ does
not vanish, we can solve equation (24) for $A_k$, expressing $A_k$ as a
linear combination of the other vectors.


Dependence of the vector Y on the vectors $A_1, \ldots, A_m$ means that
a certain system of linear equations has solutions $x_1, \ldots, x_m$. For
let $Y = (y_1, \ldots, y_n)$, and let the vector $A_k$ be given by
$A_k = (a_{1k}, a_{2k}, \ldots, a_{nk}).$


Then the vector equation (21), written out by components, is equivalent
to the system of n linear equations

(25)
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m &= y_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m &= y_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m &= y_n
\end{aligned}$$

for the unknown quantities $x_1, \ldots, x_m$. Obviously, Y is dependent
on $A_1, \ldots, A_m$ if and only if the system (25) possesses at least one
solution $x_1, \ldots, x_m$. Similarly, the vectors $A_1, \ldots, A_m$ are
dependent if and only if the “homogeneous” system of equations

(25a)
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m &= 0 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m &= 0 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m &= 0
\end{aligned}$$

has a “nontrivial” solution $x_1, \ldots, x_m$, that is, has a solution
different from the trivial solution¹


$x_1 = x_2 = \cdots = x_m = 0.$


We found one set of n vectors in n-dimensional space that are 
independent, namely, the coordinate vectors $E_1, \ldots, E_n$. Basic for
the theory of vectors is the fact that n is the maximum number of 
independent vectors: 


FUNDAMENTAL THEOREM OF LINEAR DEPENDENCE. Every n+1 
vectors in n-dimensional space are dependent. 

Before proving this theorem we consider some of its far-reaching 
implications. We can conclude immediately that any set of more than 
n vectors in n-dimensional space is dependent. For any dependence 
(24) between the first n + 1 of m vectors can be considered a de- 
pendence of all m vectors, if to the remaining vectors we assign the 
coefficient 0. The fundamental theorem then implies: The system of 
homogeneous linear equations (25a) always has a nontrivial solution if 
m > n, that is, if the number of unknowns exceeds the number of 
equations. 
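
The consequence just stated can be checked on small examples. The following is a minimal sketch, assuming ordinary Gaussian elimination with exact rational arithmetic; this is a computational technique, not the argument used in the text's proof, and the function name `nontrivial_solution` and the sample vectors are illustrative assumptions.

```python
from fractions import Fraction


def nontrivial_solution(rows):
    """Return a nonzero solution x of the homogeneous system rows . x = 0,
    or None if only the trivial solution exists.

    rows -- list of n equations, each a list of m coefficients.
    """
    m = len(rows[0])
    a = [[Fraction(v) for v in row] for row in rows]
    pivot_cols = []
    r = 0  # index of the next row that still needs a pivot
    for col in range(m):
        piv = next((i for i in range(r, len(a)) if a[i][col] != 0), None)
        if piv is None:
            continue  # no pivot here: this column stays a free unknown
        a[r], a[piv] = a[piv], a[r]
        pv = a[r][col]
        a[r] = [v / pv for v in a[r]]
        for i in range(len(a)):
            if i != r and a[i][col] != 0:
                f = a[i][col]
                a[i] = [vi - f * vr for vi, vr in zip(a[i], a[r])]
        pivot_cols.append(col)
        r += 1
        if r == len(a):
            break  # every equation has a pivot; remaining columns are free
    free = [c for c in range(m) if c not in pivot_cols]
    if not free:
        return None
    # Set one free unknown to 1, the other free unknowns to 0,
    # and read the pivot unknowns off the reduced equations.
    x = [Fraction(0)] * m
    x[free[0]] = Fraction(1)
    for row_index, col in enumerate(pivot_cols):
        x[col] = -a[row_index][free[0]]
    return x


# Three vectors in the plane, written as the columns of a 2 x 3 system:
# A1 = (1, 2), A2 = (3, 1), A3 = (5, 5).
x = nontrivial_solution([[1, 3, 5], [2, 1, 5]])
print(x)  # coefficients of a dependence x1*A1 + x2*A2 + x3*A3 = 0
```

With more unknowns than equations at least one column is left without a pivot, which is why the function always finds a nonzero solution in that case.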

We can formulate the last statement geometrically in a different
way, if we interpret each of the equations (25a) as stating that a
certain scalar product of two vectors in m-dimensional space vanishes.
A nontrivial solution $x_1, \ldots, x_m$ then corresponds to a vector
$X = (x_1, \ldots, x_m) \neq 0$. The vanishing of the scalar product of two
nonvanishing vectors means that the vectors are perpendicular to each
other. Equations (25a) state that X is perpendicular to the n vectors
$(a_{11}, a_{12}, \ldots, a_{1m}), (a_{21}, a_{22}, \ldots, a_{2m}), \ldots, (a_{n1}, a_{n2}, \ldots, a_{nm})$. We
have then: Given a set of nonvanishing vectors whose number is less
than the dimension of the space, we can find a vector that is
perpendicular to all of them (and hence, by p. 137, is independent of them).


¹ Equations of the type $P(x_1, x_2, \ldots, x_m) = 0$ where P is a homogeneous polynomial
(see p. 13) are called homogeneous. They always have the trivial solution
$x_1 = x_2 = \cdots = x_m = 0$. Moreover any solution $x_1, \ldots, x_m$ stays a solution if we
multiply all of the $x_i$ by the same factor $\lambda$.

Returning to vectors in n-dimensional space, we observe a further
consequence of the fundamental theorem: Every vector Y in n-dimensional
space is dependent on n given vectors $A_1, \ldots, A_n$, provided
$A_1, \ldots, A_n$ are independent. For since the n + 1 vectors
$A_1, \ldots, A_n, Y$ must be dependent, we have a relation of the form


$z_1 A_1 + z_2 A_2 + \cdots + z_n A_n + z_{n+1} Y = 0,$


where not all of the quantities $z_1, \ldots, z_{n+1}$ vanish. Then $z_{n+1} \neq 0$,
since otherwise $A_1, \ldots, A_n$ would be dependent, contrary to
assumption. It follows that


(26) $Y = x_1 A_1 + x_2 A_2 + \cdots + x_n A_n$
where
$x_i = -\frac{z_i}{z_{n+1}} \quad (i = 1, \ldots, n).$


Incidentally, the coefficients $x_i$ in the representation (26) of Y as a
linear combination of the independent vectors $A_1, \ldots, A_n$ are
uniquely determined, for if there were a second representation


$Y = y_1 A_1 + y_2 A_2 + \cdots + y_n A_n$
it would follow by subtracting that
$(x_1 - y_1)A_1 + (x_2 - y_2)A_2 + \cdots + (x_n - y_n)A_n = 0.$


Here for independent vectors $A_1, \ldots, A_n$ we conclude that all
coefficients vanish and hence that $x_1 = y_1, \ldots, x_n = y_n$.

On the other hand, if $A_1, \ldots, A_n$ are dependent, we certainly can
find a vector Y that does not depend on $A_1, \ldots, A_n$, for in that case,
one of the vectors $A_1, \ldots, A_n$ is dependent on the others, say $A_n$
on $A_1, \ldots, A_{n-1}$; a vector Y dependent on $A_1, \ldots, A_n$ is then also
dependent on $A_1, \ldots, A_{n-1}$. There are, however, vectors Y in
n-dimensional space that do not depend on n - 1 given vectors (see
p. 139).

Since independence of $A_1, \ldots, A_n$ is equivalent to the fact that
the corresponding system of homogeneous linear equations (25a) has
only the trivial solution, we have deduced the following basic theorem
on solvability of systems of linear equations from the fundamental
theorem:


The system of n linear equations

(27)
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= y_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= y_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= y_n
\end{aligned}$$

has a unique solution $x_1, \ldots, x_n$ for any given numbers $y_1, \ldots, y_n$
provided the homogeneous equations

(27a)
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= 0
\end{aligned}$$

have only the trivial solution $x_1 = x_2 = \cdots = x_n = 0$. If the system
(27a) has a nontrivial solution we can find values $y_1, \ldots, y_n$ for
which the system (27) has no solution.

We have here a pure existence theorem that gives no indication of
how the solution $x_1, x_2, \ldots, x_n$, if it exists, can actually be obtained.
This can be achieved by means of determinants, as discussed in
Section 2.3 below.

We proceed to the proof of the fundamental theorem, using induction
over the dimension n. The theorem states that any n + 1
vectors $A_1, \ldots, A_n, Y$ in n-dimensional space are dependent. For
n = 1, vectors become scalars, and the statement to be proved is the
following: For any two numbers Y and A we can find numbers $x_0, x_1$,
which do not both vanish, such that


$x_0 Y + x_1 A = 0.$


This is trivial. If Y = A = 0, we take $x_0 = x_1 = 1$; in all other cases,
we take $x_0 = A$, $x_1 = -Y$.

Assume that we have proved that any n vectors in (n - 1)-dimensional
space are dependent. Let $A_1, \ldots, A_n, Y$ be vectors in
n-dimensional space. We want to prove that $A_1, \ldots, A_n, Y$ are
dependent. This is certainly the case if $A_1, \ldots, A_n$ alone are already
dependent. Thus we restrict ourselves to the case that $A_1, \ldots, A_n$
are independent; we shall prove that then Y is dependent on
$A_1, \ldots, A_n$. It is sufficient to prove that each of the coordinate vectors
$E_1, \ldots, E_n$ in (22) is dependent on $A_1, \ldots, A_n$, for any vector Y is, by (23), a
linear combination of the $E_i$ and hence also of the $A_k$ if the $E_i$ can
be expressed in terms of the $A_k$. We shall prove only that $E_n$ is
dependent on $A_1, \ldots, A_n$, since the proof for the other $E_i$ is similar.
We only have to show that the system of equations

(28)
$$\begin{aligned}
a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n &= 0 \quad (i = 1, \ldots, n-1) \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= 1
\end{aligned}$$

has a solution $x_1, \ldots, x_n$. Now the first n - 1 equations, which are
homogeneous, have a nontrivial solution $x_1, \ldots, x_n$ as a consequence
of the induction assumption that n vectors in (n - 1)-dimensional
space are dependent (applied to the n vectors $(a_{1k}, \ldots, a_{n-1,k})$,
k = 1, ..., n). For that solution, let


$a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = c.$


Here $c \neq 0$, since otherwise the vectors $A_1, \ldots, A_n$ would be
dependent. Dividing $x_1, x_2, \ldots, x_n$ by c, we obtain then the desired
solution of the system (28). This completes the proof of the
fundamental theorem.


Exercises 2.1 


1. Give the coordinate representation of the line passing through the
point P = (-2, 0, 4) and in the direction of the vector A = (2, 1, 8).

2. (a) What is the equation of the line passing through the points
P = (3, -2, 2) and Q = (6, -5, 4)?
(b) Give the equation of the line passing through any two distinct
points P and Q.

3. If A and B are two vectors with initial point O and final points P and
Q, then the vector with O as initial point and the point dividing PQ
in the ratio $\lambda : (1 - \lambda)$ as final point is given by

$(1 - \lambda)A + \lambda B.$


4. In Exercise 3, for what values of $\lambda$ does the position vector correspond
to a point on the ray in the direction of Q from P?


5. The center of mass of the vertices of a tetrahedron PQRS may be
defined as the point dividing MS in the ratio 1:3, where M is the center
of mass of the vertices PQR. Show that this definition is independent
of the order in which the vertices are taken and that it agrees with the
general definition of the center of mass (Volume I, p. 373).

6. Two edges of a tetrahedron are called opposite if they have no vertex
in common. For example, the edges PQ and RS of the tetrahedron of
Exercise 5 are opposite. Show that the segment joining the midpoints
of opposite edges of a tetrahedron passes through the center of mass of
the vertices.

7. Let $A_1, \ldots, A_n$ be n arbitrary particles in space, with masses
$m_1, m_2, \ldots, m_n$, respectively. Let G be their center of mass and let
$\mathbf{A}_1, \ldots, \mathbf{A}_n$ denote the vectors with initial point G and final points
$A_1, \ldots, A_n$. Prove that

$m_1 \mathbf{A}_1 + m_2 \mathbf{A}_2 + \cdots + m_n \mathbf{A}_n = 0.$

8. The real numbers form a one-dimensional vector space where addition
of “vectors” is ordinary addition and multiplication by scalars is
ordinary multiplication. Show that the positive real numbers also form
a vector space where addition of vectors is ordinary multiplication and
scalar multiplication is appropriately defined.

9. Verify that the complex numbers form a two-dimensional vector space
where addition is ordinary addition and the scalars are real numbers.

10. Let P and Q be diametrically opposite points and R any other point on
a sphere. Show that PR meets QR at right angles.

11. (a) Obtain the normal form of the plane through the point P = (-3, 2, 1)
and perpendicular to the vector A = (1, 2, -2).
(b) What is the distance of the point Q = (1, -1, -1) from the plane?
(c) Do O and Q lie on the same or opposite sides of the plane?

12. (a) Let the equation of a hyperplane be given in the form (18). Determine
the coordinates of the foot of the perpendicular from a point
P to the hyperplane.
(b) In Exercise 11, give the feet of the perpendiculars from O and Q on
the plane.

13. Let A and B be nonparallel vectors. Show that

$C = A - \frac{A \cdot B}{|B|^2}\,B$

is perpendicular to B. The vector C is called the component of A
perpendicular to B.

14. Find the angle $\varphi$ between the plane

$Ax + By + Cz + D = 0$

and the line

$x = x_0 + \alpha t, \quad y = y_0 + \beta t, \quad z = z_0 + \gamma t.$
2.2 Matrices and Linear Transformations


a. Change of Base. Linear Spaces 


Every vector Y in n-dimensional space $R^n$ can be written as a linear
combination of the coordinate vectors $E_1, \ldots, E_n$ defined by (22);
namely,
(29) $Y = y_1 E_1 + \cdots + y_n E_n,$


where the $y_i$ are the components of Y. We can generalize the notion of
coordinate vector and of components by considering any m independent
vectors $A_1, \ldots, A_m$ in n-dimensional space. If Y is a vector dependent on the
$A_i$, we have


(30) $Y = x_1 A_1 + \cdots + x_m A_m$


where the coefficients $x_i$ are determined uniquely by Y. We call
$x_1, \ldots, x_m$ the components of Y with respect to the base $A_1, \ldots, A_m$. With
respect to this base, the base vector $A_1$ has the components 1, 0, ..., 0;
the base vector $A_2$, the components 0, 1, ..., 0; and so on.
For any scalar $\lambda$ the vector


$\lambda Y = \lambda x_1 A_1 + \cdots + \lambda x_m A_m$


also is dependent on the $A_i$ and has components $\lambda x_1, \ldots, \lambda x_m$.
Similarly, if


$Y' = x_1' A_1 + \cdots + x_m' A_m$
is a second vector depending on the $A_i$, the sum
$Y + Y' = (x_1 + x_1')A_1 + \cdots + (x_m + x_m')A_m$


has the components $x_1 + x_1', \ldots, x_m + x_m'$ with respect to our base.

For m < n not all vectors Y in n-dimensional space are dependent
on $A_1, \ldots, A_m$. The vectors dependent on m independent vectors
are said to form an m-dimensional vector space. We can visualize such
a space by choosing an arbitrary point $P_0$ with position vector
$B = \overrightarrow{OP_0}$ as initial point for all the vectors $A_1, \ldots, A_m$. Let
(31a) $A_i = \overrightarrow{P_0 P_i} \quad (i = 1, \ldots, m)$


and let $Y = \overrightarrow{P_0 P}$ be the vector given by (30). Then the point P has the
position vector
(31b) $\overrightarrow{OP} = \overrightarrow{OP_0} + \overrightarrow{P_0 P} = B + x_1 A_1 + \cdots + x_m A_m.$


The points P in relation (31b) are said to form the m-dimensional linear
manifold $S_m$ through $P_0$ spanned by the vectors $A_1, \ldots, A_m$. Every
point P in $S_m$ uniquely determines values $x_1, \ldots, x_m$, which we call
affine coordinates for P. In this affine coordinate system for $S_m$
the “origin” (that is, the point with $x_1 = x_2 = \cdots = x_m = 0$) is
the point $P_0$; the point with affine coordinates $x_1 = 1, x_2 = \cdots = x_m = 0$
is $P_1$, the end point of the vector $A_1 = \overrightarrow{P_0 P_1}$, and so on. For two
points P and P' of $S_m$ with position vectors


$\overrightarrow{OP} = B + x_1 A_1 + \cdots + x_m A_m, \quad \overrightarrow{OP'} = B + x_1' A_1 + \cdots + x_m' A_m,$


the vector


$\overrightarrow{PP'} = \overrightarrow{OP'} - \overrightarrow{OP} = (x_1' - x_1)A_1 + \cdots + (x_m' - x_m)A_m$


has as components with respect to the base $A_1, \ldots, A_m$ the differences
of the affine coordinates of the points P and P'.

According to our definition a one-dimensional linear manifold $S_1$
through the point $P_0$ is the locus of points P with position vectors of
the form


$\overrightarrow{OP} = B + x_1 A_1$


where B and $A_1$ are fixed vectors ($A_1 \neq 0$) and $x_1$ ranges over all
real numbers. Of course, $S_1$ is merely the straight line through $P_0$
parallel to the direction of the vector $A_1$ (see p. 130). A two-dimensional
linear manifold or two-dimensional plane $S_2$ consists of the
points P with position vectors


$\overrightarrow{OP} = B + x_1 A_1 + x_2 A_2$


where B, $A_1$, $A_2$ are fixed vectors ($A_1$ and $A_2$ independent) and $x_1$ and
$x_2$ range over all real numbers. The n-dimensional linear spaces $S_n$
are identical with the whole space $R^n$; for any vector Y is dependent
on n linearly independent vectors $A_1, \ldots, A_n$ (see p. 133), and hence
the position vector of any point P is representable in the form


$\overrightarrow{OP} = B + x_1 A_1 + \cdots + x_n A_n.$

The (n - 1)-dimensional linear manifolds can be seen to be identical
with the hyperplanes defined on p. 133. For given any n - 1 vectors
$A_1, \ldots, A_{n-1}$ in n-dimensional space, we can find a vector A
perpendicular to all of them (see p. 139). Then for


$\overrightarrow{OP} = B + x_1 A_1 + \cdots + x_{n-1} A_{n-1}$


we have the relation


$A \cdot \overrightarrow{OP} = B \cdot A + x_1 A_1 \cdot A + \cdots + x_{n-1} A_{n-1} \cdot A = B \cdot A = \text{constant},$


which is just a linear equation for the coordinates of P.

In general, the determination of the components $x_i$ of a vector
Y with respect to a base $A_1, \ldots, A_m$ requires the solution of a system
of linear equations of the type (25). In one important special case, the
$x_i$ can be found directly, namely, when the base vectors form an
orthonormal system. We call the vectors $A_1, \ldots, A_m$ orthonormal
if each of them has length 1 and any two are orthogonal to each other,
that is, if


(32) $A_i \cdot A_k = \begin{cases} 0 & \text{for } i \neq k \\ 1 & \text{for } i = k. \end{cases}$


If a vector Y is of the form
$Y = x_1 A_1 + x_2 A_2 + \cdots + x_m A_m,$
we find, using the orthogonality relations (32), that


(33) $Y \cdot A_i = x_1 A_1 \cdot A_i + x_2 A_2 \cdot A_i + \cdots + x_m A_m \cdot A_i = x_i \quad (i = 1, \ldots, m).$


In particular, Y = 0 implies $x_i = 0$ for i = 1, ..., m; thus orthonormal
vectors always are independent. Formula (33) shows that the
component $x_i$ of the vector Y with respect to an orthonormal base
$A_1, \ldots, A_m$ is equal to the component $Y \cdot A_i$ of the vector Y in the
direction of $A_i$. The coordinate vectors $E_1, \ldots, E_n$ defined by
equations (22) form just such an orthonormal base, and the components
of the vector $Y = (y_1, \ldots, y_n)$ with respect to this base are the
quantities $Y \cdot E_i = y_i$.

An orthonormal base is also distinguished by the fact that the
length of a vector and the scalar product of two vectors are given by the
same formulae as in the original base $E_1, \ldots, E_n$. Given any two
vectors Y and Y' of the form


(34a) $Y = x_1 A_1 + \cdots + x_m A_m, \quad Y' = x_1' A_1 + \cdots + x_m' A_m$
we have


(34b) $Y \cdot Y' = (x_1 A_1 + \cdots + x_m A_m) \cdot (x_1' A_1 + \cdots + x_m' A_m)$
$\qquad = x_1 A_1 \cdot (x_1' A_1 + \cdots + x_m' A_m) + \cdots + x_m A_m \cdot (x_1' A_1 + \cdots + x_m' A_m)$
$\qquad = x_1 x_1' + x_2 x_2' + \cdots + x_m x_m'.$¹


In the particular case Y' = Y we find for the length of the vector
Y the formula


(34c) $|Y| = \sqrt{Y \cdot Y} = \sqrt{x_1^2 + \cdots + x_m^2}.$


If the m-dimensional linear manifold $S_m$ through the point $P_0$ is
spanned by m orthonormal vectors $A_1, \ldots, A_m$, the corresponding
affine coordinate system is called a Cartesian coordinate system for
the space $S_m$. The coordinate vectors $A_1, \ldots, A_m$ are mutually
perpendicular and of length 1. The distance d between any two points
with Cartesian coordinates $(x_1, \ldots, x_m)$ and $(x_1', \ldots, x_m')$ is given
by the formula


$d = \sqrt{(x_1' - x_1)^2 + \cdots + (x_m' - x_m)^2}.$


More generally any geometric relation based on the notion of distance
(such as angle, area, volume) has the same analytic expression in any
Cartesian coordinate system.
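
Formulas (33) and (34c) can be illustrated numerically: with an orthonormal base the components are obtained by scalar products alone, without solving a system of equations. The sketch below is an assumption-laden illustration; the two base vectors and the coefficients 3 and -2 are hypothetical sample values.

```python
import math


def dot(a, b):
    """Scalar product of two vectors given as sequences of components."""
    return sum(x * y for x, y in zip(a, b))


# Two orthonormal vectors spanning a plane in three-dimensional space.
s = 1 / math.sqrt(2)
A1 = (s, s, 0.0)
A2 = (s, -s, 0.0)

# A vector built with known components 3 and -2 relative to A1, A2.
Y = tuple(3 * u - 2 * v for u, v in zip(A1, A2))

# Formula (33): recover the components as scalar products with the base vectors.
x = [dot(Y, A1), dot(Y, A2)]
print(x)  # close to [3, -2], up to rounding

# Formula (34c): the length computed from the components x_i
# agrees with the length computed from the original coordinates.
print(math.sqrt(sum(xi * xi for xi in x)), math.sqrt(dot(Y, Y)))
```

Both printed lengths equal the square root of 13 up to rounding, since 3² + (-2)² = 13.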


b. Matrices 


The relation
(35a) $Y = x_1 A_1 + \cdots + x_m A_m$


between vectors $A_1, \ldots, A_m, Y$ in n-dimensional space can be written
as a system of linear equations [see (25), p. 138]

¹ Without the orthogonality relations we could only conclude that $Y \cdot Y'$ is given
by the more complicated expression

$Y \cdot Y' = \sum_{i,k=1}^{m} c_{ik}\, x_i x_k'$, where $c_{ik} = A_i \cdot A_k$.
(35b)
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m &= y_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m &= y_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m &= y_n
\end{aligned}$$

connecting the components $y_1, \ldots, y_n$ of the vector Y in the original
coordinate system with the components $x_1, \ldots, x_m$ of Y with respect
to the base vectors $A_i = (a_{1i}, a_{2i}, \ldots, a_{ni})$ for i = 1, ..., m. The
linear relations (35b) between the quantities $x_i$ and $y_j$ are completely
described by the system of n x m coefficients $a_{ji}$. The system of
coefficients arranged in a rectangular array


(36) $\mathbf{a} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{pmatrix}$


as they appear in (35b) is called a matrix.
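(We shall usually denote matrices by boldface lower-case letters.)
The matrix a in (36) has mn “elements”


$a_{ji} \quad (j = 1, \ldots, n; \; i = 1, \ldots, m).$


These elements are arranged in m “columns”


$\begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{pmatrix}, \quad \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{pmatrix}, \quad \ldots, \quad \begin{pmatrix} a_{1m} \\ a_{2m} \\ \vdots \\ a_{nm} \end{pmatrix}$


or in n “rows”


$(a_{11} \;\; a_{12} \;\; \cdots \;\; a_{1m}),$

$(a_{21} \;\; a_{22} \;\; \cdots \;\; a_{2m}),$

$\ldots,$

$(a_{n1} \;\; a_{n2} \;\; \cdots \;\; a_{nm}).$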

Two matrices are considered equal only if they agree in the number 
of rows and columns and if corresponding elements are the same. 
The columns of the matrix a can be identified respectively with the
set of components of the vectors $A_1, A_2, \ldots, A_m$. We shall often write
the matrix a whose columns are formed from the components of the
vectors $A_1, A_2, \ldots, A_m$ as


(37) $\mathbf{a} = (A_1, A_2, \ldots, A_m).$


The system of equations (35b) expressing the n quantities $y_1, \ldots, y_n$
as linear functions of the m quantities $x_1, \ldots, x_m$ can be compressed
into the single symbolic equation


(38) $\mathbf{a}X = Y,$
where X stands for the vector $(x_1, \ldots, x_m)$ and Y for the vector
$(y_1, \ldots, y_n)$. If the column vectors $A_1, \ldots, A_m$ of the matrix a are
independent, we can interpret (38) as describing a change of base or
of coordinate system for vectors.

The equation connects the components $x_1, \ldots, x_m$ of the vector
with respect to the base $A_1, \ldots, A_m$ in the subspace $S_m$ with the
components $y_1, \ldots, y_n$ of the same vector with respect to the base
$E_1, \ldots, E_n$ for the whole space $S_n$. This might be called the “passive”
interpretation of (38), in which the geometrical objects (the
vectors) stay fixed and only the reference system is switched.

There is another, “active” interpretation, in which the vectors
change rather than the coordinate system. Equations (38) then
describe a mapping of vectors $(x_1, \ldots, x_m)$ in an m-dimensional space
onto vectors $(y_1, \ldots, y_n)$ in an n-dimensional space. A mapping given
by equation (38), or in more detail by the equivalent system of
equations (35b), is called linear, or affine.¹


¹ In an affine mapping of vectors the components $y_j$ of the image vector Y are
homogeneous linear functions of components $x_i$ of the original vector X, as in formulae
(35b). If we identify X and Y with position vectors of points, formulae (35b) define a
mapping of points $(x_1, \ldots, x_m)$ in the space $R^m$ onto points $(y_1, \ldots, y_n)$ in the space
$R^n$. The point mappings obtained in this way are the special affine mappings that
take the origin of $R^m$ into the origin of $R^n$. The most general affine mapping of points
is given by inhomogeneous linear equations


(*) $y_j = \sum_{i=1}^{m} a_{ji} x_i + b_j \quad (j = 1, \ldots, n).$


(It can be obtained from a special mapping taking the origin into the origin by a
translation with components $b_j$.) Applying the mapping (*) to two points
$P' = (x_1', \ldots, x_m')$, $P'' = (x_1'', \ldots, x_m'')$ with images $Q' = (y_1', \ldots, y_n')$,
$Q'' = (y_1'', \ldots, y_n'')$, we see that the corresponding mapping of the vectors
$\overrightarrow{P'P''} = (x_1'' - x_1', \ldots, x_m'' - x_m') = (x_1, \ldots, x_m)$ onto the vectors
$\overrightarrow{Q'Q''} = (y_1'' - y_1', \ldots, y_n'' - y_n') = (y_1, \ldots, y_n)$ is given by the
homogeneous equations (35b).
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For example the system of equations 


1 
X2, yo=—tn +? 


2 1 
(38a) y= 53% — 3 3 


3 3 X2, 


1 
y3 = — 3 %1 — 3X2 


corresponding to the matrix 


| 
Cole wWIN Wik 


| 
Coie cle WIN 


can be interpreted as a mapping of vectors $X = (x_1, x_2)$ in the plane
onto vectors $Y = (y_1, y_2, y_3)$ in three-dimensional space. Here the
image vectors all satisfy the relation

(38b) $y_1 + y_2 + y_3 = 0$

and hence are orthogonal to the vector N = (1, 1, 1). Identifying the
vectors X, Y with position vectors of points, we have in (38a) a mapping
of the $x_1 x_2$-plane onto the plane π in $y_1 y_2 y_3$-space with equation
(38b). Geometrically the point $(y_1, y_2, y_3)$ is obtained by projecting the
point $(x_1, x_2, 0)$ perpendicularly onto the plane π.¹ Alternately,
equations (38a) can be interpreted passively as a parametric representation
for the plane π, with $x_1$ and $x_2$ playing the role of parameters.
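
The geometric statement about (38a) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `apply` and `project_onto_plane` and the sample vector are illustrative assumptions. It applies the matrix of (38a) to a vector and compares the result with the perpendicular projection of $(x_1, x_2, 0)$ onto the plane (38b).

```python
from fractions import Fraction as F

# Matrix of the mapping (38a): three rows, two columns.
A = [[F(2, 3), F(-1, 3)],
     [F(-1, 3), F(2, 3)],
     [F(-1, 3), F(-1, 3)]]

def apply(a, x):
    """Return Y = aX, with y_j = sum over i of a[j][i] * x[i]."""
    return [sum(a_ji * x_i for a_ji, x_i in zip(row, x)) for row in a]

def project_onto_plane(p, normal):
    """Perpendicular projection of p onto the plane through the origin with the given normal."""
    t = sum(pi * ni for pi, ni in zip(p, normal)) / sum(ni * ni for ni in normal)
    return [pi - t * ni for pi, ni in zip(p, normal)]

x = [F(5), F(-2)]          # arbitrary sample vector (an assumption for illustration)
y = apply(A, x)            # image vector under (38a)
```

Exact rational arithmetic is used so that the relation (38b) holds without rounding error.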

Different matrices give rise to different linear mappings, for by
(35b) the coordinate vectors

$E_1 = (1, 0, \ldots, 0), \quad E_2 = (0, 1, \ldots, 0), \ldots$

are mapped onto the vectors

$A_1 = (a_{11}, a_{21}, \ldots, a_{n1}), \quad A_2 = (a_{12}, a_{22}, \ldots, a_{n2}), \ldots$

Thus, the column vectors $A_1, A_2, \ldots, A_m$ of the matrix a are just the
images of the coordinate vectors $E_1, E_2, \ldots, E_m$. Hence, the matrix
a is determined uniquely by the mapping.


¹The line joining $(x_1, x_2, 0)$ and $(y_1, y_2, y_3)$ is parallel to the normal N of π.

Of particular importance are the linear mappings Y = aX of the
n-dimensional vector space into itself; they map a vector $X = (x_1, \ldots, x_n)$
onto a vector $Y = (y_1, \ldots, y_n)$ with the same number of components.
Such mappings correspond to matrices a with as many rows
as columns, so-called square matrices.¹ Written out by components,
the mapping Y = aX corresponding to a square matrix a with n rows
and columns takes the form (27), p. 140. The basic theorem of solvability
of systems of n linear equations for n unknown quantities (p. 140)
can now be stated alternatively as follows:

For a square matrix a there are two mutually exclusive possibilities:

(1) $aX \neq 0$ for every vector $X \neq 0$

(2) $aX = 0$ for some vector $X \neq 0$.

In case (1) there exists for every vector Y a unique vector X such that
Y = aX. In case (2) there exist vectors Y for which the equation Y = aX
holds for no vector X.²

We call the matrix a singular in case (2) and nonsingular in case 
(1). Since existence of a nontrivial solution X of the equation aX = 
0 is equivalent to dependence of the column vectors of the matrix 
a, we see that a square matrix a is singular if and only if its column 
vectors are dependent. 
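
The criterion just stated lends itself to a direct computation. The sketch below is an illustration under stated assumptions rather than a method taken from the text: it decides dependence of the columns by Gaussian elimination over the rationals, and the function names `rank` and `is_singular` are hypothetical.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        r += 1
    return r

def is_singular(a):
    """A square matrix is singular exactly when its columns are dependent (rank < n)."""
    return rank(a) < len(a)
```

For instance, `is_singular([[1, 2], [2, 4]])` is true because the second column is twice the first.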


c. Operations with Matrices 


It is customary to denote the elements of a matrix a as in (36) by
letters bearing two subscripts, such as $a_{ji}$. The subscripts indicate
the location or address of the element in the matrix, the first subscript
giving the row number, the second the column number. For a matrix
with n rows and m columns having elements $a_{ji}$ the subscript j ranges
over 1, 2, ..., n and the subscript i over 1, 2, ..., m. Equation (36)
is often abbreviated into the formula

$a = (a_{ji}),$

which only exhibits the elements of the matrix a but does not show
the numbers of rows and columns, which have to be deduced from the
context.³ In the example


¹The more general matrices with arbitrary numbers of rows and columns are referred
to as rectangular matrices.

²In case (1) the equation Y = aX represents a 1-1 mapping of the n-dimensional
vector space onto itself. In case (2) the mapping is neither 1-1 nor onto.

³The letter a in $a_{ji}$ is the name of a real-valued function of the independent variables
j and i. The domain of this function consists of the points in the j, i-plane whose

$a = (a_{ji}) = \begin{pmatrix} 1! & 2! & 3! & \cdots & m! \\ 2! & 3! & 4! & \cdots & (m+1)! \\ 3! & 4! & 5! & \cdots & (m+2)! \\ \vdots & \vdots & \vdots & & \vdots \\ n! & (n+1)! & (n+2)! & \cdots & (m+n-1)! \end{pmatrix}$

we have $a_{ji} = (i + j - 1)!$.

Addition of matrices and multiplication of matrices by scalars are
defined in the same way as for vectors. If $a = (a_{ji})$ and $b = (b_{ji})$
are matrices of the same "size" (that is, with the same numbers of
rows and columns) we define a + b as the matrix obtained by adding
corresponding elements:

$a + b = (a_{ji} + b_{ji}).$

Similarly, for a scalar λ we define λa as the matrix obtained by
multiplying each element of a by the factor λ:

$\lambda a = (\lambda a_{ji}).$

One verifies immediately the rules

(39) $(a + b)X = aX + bX, \qquad (\lambda a)X = \lambda(aX)$

for the mappings of vectors X determined by the matrices.

More significant is the fact that matrices of suitable sizes can be
multiplied with each other. A natural definition of the product of two
matrices a, b is obtained by considering the symbolic product, or
composition, of the corresponding mappings (see Volume I, p. 52). If
$a = (a_{ji})$ is a matrix with m columns and n rows, and if $X = (x_1, \ldots, x_m)$
is a vector with m components, then a determines the mapping
Y = aX of the vector X onto the vector $Y = (y_1, \ldots, y_n)$ with the
n components

$y_j = \sum_{i=1}^{m} a_{ji} x_i \qquad (j = 1, \ldots, n).$

If now $b = (b_{kj})$ is a matrix with n columns and p rows, then the


coordinates are integers with $1 \le j \le n$ and $1 \le i \le m$. Ordinarily we write a
function f of two independent variables x, y as f(x, y), and a more consistent notation
here would be a(j, i) instead of the customary $a_{ji}$.

mapping Z = bY will map Y onto the vector $Z = (z_1, \ldots, z_p)$ with the
p components

$z_k = \sum_{j=1}^{n} b_{kj} y_j = \sum_{j=1}^{n} \sum_{i=1}^{m} b_{kj} a_{ji} x_i = \sum_{i=1}^{m} c_{ki} x_i,$

where

(40) $c_{ki} = \sum_{j=1}^{n} b_{kj} a_{ji} \qquad (k = 1, \ldots, p; \; i = 1, \ldots, m).$

Thus Z = cX, where $c = ba = (c_{ki})$ is the matrix with p rows and
m columns and with elements given by formula (40). Accordingly, we
define the product c = ba of the matrices b and a as the matrix with
elements $c_{ki}$ given by (40).
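
Formula (40) translates directly into a short routine. The following sketch is illustrative only: the function names and the sample matrices are assumptions introduced here. It forms c = ba and confirms on one vector that applying c agrees with applying a and then b.

```python
def mat_vec(a, x):
    """Y = aX with y_j = sum over i of a[j][i] * x[i]."""
    return [sum(a_ji * x_i for a_ji, x_i in zip(row, x)) for row in a]

def mat_mul(b, a):
    """c = ba with c[k][i] = sum over j of b[k][j] * a[j][i], as in formula (40)."""
    n = len(a)                       # number of rows of a
    if any(len(row) != n for row in b):
        raise ValueError("number of columns of b must equal number of rows of a")
    m = len(a[0])                    # number of columns of a
    return [[sum(b[k][j] * a[j][i] for j in range(n)) for i in range(m)]
            for k in range(len(b))]

a = [[1, 2], [0, 1], [3, -1]]        # n = 3 rows, m = 2 columns (illustrative)
b = [[1, 0, 2], [-1, 1, 0]]          # p = 2 rows, n = 3 columns (illustrative)
x = [4, 5]
c = mat_mul(b, a)                    # matrix of the composed mapping
```

The size check mirrors the requirement that the number of columns of b equal the number of rows of a.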

We observe that the product ba is defined only if the number of 
columns of b is the same as the number of rows of a. This corresponds 
to the obvious fact that the symbolic product of two mappings can 
only be formed, if the domain of the first factor contains the range 
of the second one. Thus it could happen very well that the product 
ba is defined but not the product ab with the factors in the reverse 
order. But even where both ba and ab are defined the commutative law 
of multiplication ab = ba in general does not hold for matrices. 
For example, for 


we have 


$\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$
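
The failure of the commutative law is easy to observe in a small computation. In the sketch below the two second-order matrices are chosen here purely for illustration (they are an assumption, not necessarily the pair used in the text), and `mat_mul` is a hypothetical helper implementing formula (40). The associative law is checked on the same data.

```python
def mat_mul(x, y):
    """Product xy with (xy)[k][i] = sum over j of x[k][j] * y[j][i]."""
    return [[sum(x[k][j] * y[j][i] for j in range(len(y)))
             for i in range(len(y[0]))]
            for k in range(len(x))]

# Illustrative matrices (assumed for this example).
a = [[0, 1], [-1, 0]]
b = [[1, 0], [0, -1]]
c = [[2, 1], [1, 3]]

ab = mat_mul(a, b)
ba = mat_mul(b, a)
```

Here `ab` and `ba` differ, while `mat_mul(a, mat_mul(b, c))` and `mat_mul(mat_mul(a, b), c)` coincide.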


However, one easily verifies from formula (40) that matrix multi- 
plication obeys the associative and distributive laws 


(41a) a(bc) = (ab)c,
(41b) a(b + c) = ab + ac, (a + b)c = ac + bc,


(for matrices of appropriate sizes). We might say that all algebraic
manipulations for matrices are permitted as long as the products
involved are defined and we do not interchange factors.

The mapping of vectors determined by the matrix a, which we had
written as Y = aX, can be considered a special example of matrix
multiplication provided we write X and Y as "column vectors," that
is, as matrices with a single column and with m and n rows, respectively:

$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix}, \qquad Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$



d. Square Matrices. The Reciprocal of a Matrix. Orthogonal 
Matrices 


Of particular importance in applications are the matrices with the 
same number of rows and columns, the so-called square matrices (the 
more general matrices with arbitrary numbers of rows and columns 
are referred to as rectangular matrices). The order of a square matrix 
is the number of its rows or columns. Any two square matrices of the 
same order n can be added or multiplied. In particular, we can form 
powers of such a matrix: 


$a^2 = aa, \quad a^3 = aaa, \ldots.$


The zero matrix 0 of order n is the matrix all of whose elements are 
0, or all of whose columns are zero vectors: 


(42a) 0=(0,0,...,0). 
It has the obvious properties 
(42b) a+0=0+a=a, a0 = 0a = 0 
(for all n-th order matrices a), 
(42c) 0X = 0 for all vectors X with n components. 


The unit matrix of order n, denoted by e, is the matrix corresponding
to the identity mapping of vectors X:

(43a) $eX = X$

for all vectors X. Since then in particular $eE_k = E_k$ for all coordinate
vectors $E_k$, we find that the unit matrix has the coordinate vectors as
columns:

(43b) $e = (E_1, E_2, \ldots, E_n) = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.$


One verifies immediately that e plays the role of a “unit” in matrix 
multiplication: 


(43c) ae=ea=a 


for all n-th order a. 
We call an nth order matrix b reciprocal to the nth order matrix 
a if 


(44) ab = e. 


If b is reciprocal to a, then a corresponds to the inverse of the map- 
ping of vectors furnished by b, for if b maps a vector Y onto X (i.e., 
if X = bY), then a maps X back onto Y, since aX = abY = eY = Y. 
More concretely, if we know a reciprocal b of the matrix $a = (a_{jk})$,
we can write down a solution $X = (x_1, x_2, \ldots, x_n)$ of the system of
linear equations

$a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = y_1$
$a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = y_2$
$\cdots$
$a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = y_n$

for any given $(y_1, \ldots, y_n) = Y$. Since abY = eY = Y, we have indeed
a solution given by X = bY, that is, by

$x_1 = b_{11} y_1 + \cdots + b_{1n} y_n$
$\cdots$
$x_n = b_{n1} y_1 + \cdots + b_{nn} y_n.$

Every real number a except zero has a reciprocal b for which ab = 1.
However, there are matrices different from the zero matrix that
have no reciprocal. If a has a reciprocal b, the equation aX = Y has for
every vector Y the solution X = bY, since

abY = eY = Y.


Hence (see p. 150) the matrix a must be nonsingular; that is, the
columns of a are independent vectors. Singular matrices have no
reciprocal. The condition ab = e for the reciprocal matrix b of a can
be written out in the form

(45) $\sum_{r=1}^{n} a_{jr} b_{rk} = e_{jk},$

where $a_{jr}$, $b_{rk}$, $e_{jk}$ denote respectively the general elements of the
matrices a, b, e. For fixed k we have in (45) a system of n linear
equations for the vector $B_k = (b_{1k}, b_{2k}, \ldots, b_{nk})$, which represents the
kth column of the matrix b. If the matrix a is nonsingular, there exists
a unique solution $B_k$ of (45) for every k. Hence, a nonsingular matrix a
has one and only one reciprocal b.

Let a be any nonsingular matrix and b its reciprocal; that is, ab = e.
Take an arbitrary vector X and put Y = aX. Since both Z = X and
Z = bY are solutions of the equation Y = aZ and since the solution
is unique, we must have

bY = X, that is, baX = X,

for every vector X. Hence (see p. 149) a is the reciprocal of b:

ba = e.


The reciprocal of a nonsingular matrix a is usually denoted by
$a^{-1}$. We have

(46) $aa^{-1} = a^{-1}a = e,$

where e is the unit matrix. The reciprocal can be calculated by solving
the system of linear equations (45) for the $b_{rk}$. Since the elements
$e_{jk}$ of the unit matrix have the value 0 for $j \neq k$ and 1 for j = k,
equations (45) state that the scalar product of the jth row of the matrix
a with the kth column of the matrix $a^{-1}$ has the value 0 for $j \neq k$ and
1 for j = k. Furthermore, since $a^{-1}a = e$ we see that the scalar product
of the jth row of $a^{-1}$ with the kth column of a also has the value
0 for $j \neq k$ and 1 for j = k.

Multiplying by reciprocals enables us to "divide" an equation
between matrices by a nonsingular matrix. For example, the matrix
equation

ab = c,

where a is a nonsingular matrix, can be solved for b by multiplying
the equation from the left by $a^{-1}$:

$a^{-1}c = a^{-1}(ab) = (a^{-1}a)b = eb = b.$

Similarly, the equation

ba = c

leads to

$ca^{-1} = b.$


From the point of view of euclidean geometry the most important
square matrices are the so-called orthogonal matrices, which correspond
to transitions from one Cartesian coordinate system to
another such system or to linear transformations that preserve
length. A square matrix a is called orthogonal if its column vectors
$A_1, \ldots, A_n$ form an orthonormal system:

(47) $A_i \cdot A_k = \begin{cases} 0 & \text{for } i \neq k \\ 1 & \text{for } i = k \end{cases}$


(see p. 145). Since vectors forming an orthonormal system are independent,
it follows that orthogonal matrices are always nonsingular.
The vector relation aX = Y corresponding to the matrix a, interpreted
passively, describes how the components $y_1, \ldots, y_n$ of a vector
with respect to the coordinate vectors $E_1, \ldots, E_n$ are connected
with the components of the same vector with respect to the base
$A_1, \ldots, A_n$. For an orthogonal matrix a the base $A_1, \ldots, A_n$ consists
of n mutually orthogonal vectors of length 1, forming a "Cartesian"
coordinate system, in which distance is given by the usual
expression (see p. 146). Interpreted actively, Y = aX represents a
linear mapping in which the coordinate vectors $E_i$ are mapped onto
the vectors $A_i$. This mapping takes a vector

$X = (x_1, \ldots, x_n) = x_1 E_1 + \cdots + x_n E_n$

into the vector

$Y = aX = a(x_1 E_1 + \cdots + x_n E_n) = x_1 aE_1 + \cdots + x_n aE_n = x_1 A_1 + \cdots + x_n A_n.$

The mapping preserves the length of any vector, since by (47)

$|Y|^2 = Y \cdot Y = (x_1 A_1 + \cdots + x_n A_n) \cdot (x_1 A_1 + \cdots + x_n A_n) = x_1^2 + \cdots + x_n^2 = |X|^2.$


More generally the mapping preserves the scalar product of any
two vectors and hence also angles between directions, as is easily
verified. Such length-preserving mappings are known as orthogonal
transformations, or rigid motions. In two dimensions they are
easily identified with the changes of coordinate axes discussed in
Volume I (p. 361). A vector $A_1$ of length 1 in two dimensions is of the
form $A_1 = (\cos \gamma, \sin \gamma)$ with some suitable angle γ. The only
vectors $A_2$ of length 1 that are perpendicular to $A_1$ are

$A_2 = \left(\cos\left(\gamma + \tfrac{\pi}{2}\right), \sin\left(\gamma + \tfrac{\pi}{2}\right)\right) = (-\sin \gamma, \cos \gamma)$

and

$A_2 = \left(\cos\left(\gamma - \tfrac{\pi}{2}\right), \sin\left(\gamma - \tfrac{\pi}{2}\right)\right) = (\sin \gamma, -\cos \gamma).$

Thus the general second-order orthogonal matrix is either of the form

(48) $a = \begin{pmatrix} \cos \gamma & -\sin \gamma \\ \sin \gamma & \cos \gamma \end{pmatrix}$ or $a = \begin{pmatrix} \cos \gamma & \sin \gamma \\ \sin \gamma & -\cos \gamma \end{pmatrix}.$
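
The two forms in (48) can be tested against the orthonormality relations (47). The following sketch uses floating-point arithmetic with a small tolerance; the function names and the tolerance value are assumptions made for illustration.

```python
import math

def first_form(gamma):
    """First matrix in (48): columns (cos g, sin g) and (-sin g, cos g)."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c, -s], [s, c]]

def second_form(gamma):
    """Second matrix in (48): columns (cos g, sin g) and (sin g, -cos g)."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c, s], [s, -c]]

def dot(u, v):
    """Scalar product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def is_orthogonal(a, tol=1e-12):
    """Check the relations (47) for the column vectors of a, up to the tolerance tol."""
    cols = [list(col) for col in zip(*a)]
    return all(abs(dot(ci, ck) - (1.0 if i == k else 0.0)) <= tol
               for i, ci in enumerate(cols) for k, ck in enumerate(cols))

def mat_vec(a, x):
    """Y = aX."""
    return [dot(row, x) for row in a]
```

Applying either matrix to the vector (3, -4) yields a vector whose squared length is again 25, up to rounding.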


The orthogonality relations (47) permit one immediately to write
down the inverse $a^{-1}$ of an orthogonal matrix a. We just take for $a^{-1}$
the matrix that has the $A_k$ as row vectors; the scalar product of the
jth row of $a^{-1}$ with the kth column of a is then 0 for $j \neq k$ and 1 for
j = k, as required by the relation $a^{-1}a = e$. Generally, for any matrix
$a = (a_{jk})$, one defines the transpose $a^T = (b_{jk})$ as the matrix obtained
from a by interchanging rows and columns. More precisely,
$b_{jk} = a_{kj}$.¹ For an orthogonal matrix we simply have


¹Thinking of a as written out as a rectangular array, one defines the "main diagonal"
of a as the line running from the upper left-hand corner downward at slope −1. It is
the line containing the elements $a_{11}, a_{22}, a_{33}, \ldots$. The transpose of a is obtained by
"reflecting" a in the main diagonal.

(49) $a^{-1} = a^T.$

For example,

$\begin{pmatrix} \cos \gamma & -\sin \gamma \\ \sin \gamma & \cos \gamma \end{pmatrix}^{-1} = \begin{pmatrix} \cos \gamma & \sin \gamma \\ -\sin \gamma & \cos \gamma \end{pmatrix}.$

Following (46) we can write relation (49) as

(49a) $a^T a = e, \qquad a a^T = e.$

The second relation shows that in an orthogonal matrix the scalar
product of the jth row with the kth row is 0 for $j \neq k$ and 1 for j = k.
Thus in an orthogonal matrix the row vectors also form an orthonormal
system.


Exercises 2.2 


1. In each case describe the space through P spanned by the vectors $A_k$.
   (a) P = (−1, 2, 1); $A_1$ = (4, 0, 3)
   (b) P = (2, 1, −4); $A_1$ = (3, −2, 1), $A_2$ = (1, 0, −1)
   (c) P = (2, 1, −4, 2); $A_1$ = (3, −2, 1, 2), $A_2$ = (1, 0, −1, 2).

2. Verify that $E_1 = (2/3, 2/3, -1/3)$, $E_2 = (1/\sqrt{2}, -1/\sqrt{2}, 0)$,
   $E_3 = (\sqrt{2}/6, \sqrt{2}/6, 2\sqrt{2}/3)$ form an orthonormal base and obtain the
   representations of the given vectors in terms of this base:
   (a) $A_1 = (\sqrt{2}, \sqrt{2}, \sqrt{2})$
   (b) $A_2 = (3, -3, 3)$
   (c) $A_3 = (1, 0, 0)$

3. Given linearly independent vectors $A_1, A_2, \ldots, A_m$, construct mutually
   perpendicular unit vectors $E_1, E_2, \ldots, E_m$ with the property that
   $E_k$ is a linear combination of $A_1, A_2, \ldots, A_k$, for k = 1, 2, ..., m.

4. From the result of Exercise 3, prove the fundamental theorem of linear
   dependence.

5. What is the distance of the point $P = (x_0, y_0, z_0)$ from the straight line
   given by
   x = at + b, y = ct + d, z = et + f?
   (Hint: Find the foot of the perpendicular from P to the line.)

6. Does the following system of equations have a nontrivial solution?
x + 2y + 3z=0 
2x + 3y+2z2=0 


3x + y + 2z = 0


7. Find the representation of the vector $(a_1, a_2, a_3)$ with respect to the
   base $A_1 = (1, 2, 3)$, $A_2 = (2, 3, 1)$, $A_3 = (3, 1, 2)$.

8. Determine the matrix for changing from Cartesian coordinates for the
   base $E_1, E_2, E_3$ to affine coordinates for the base $A_1, A_2, A_3$ given in
   Exercise 7.

9. Prove that if the matrix a is singular, there exist vectors Y for which
   Y = aX has no solution.

10. Obtain the products ab and ba for the matrices


1 2 0 —2 1 0 
a=!|0 0 1 |, b= | 0 1-2 

2 1 0 1 
11. Find conditions that the 2 × 2 matrix

    $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$

    has a reciprocal and give that reciprocal if it exists.

12. Show that there is only one unit matrix.

13. Find the reciprocal of ab, if neither a nor b is singular.

14. Sometimes a singular n × n matrix is defined as a matrix that maps
    n-dimensional space onto a space of lower dimension. Show that this
    definition is equivalent to the one given here.

15. Interpret the matrices in (48) geometrically.

16. Prove that a is orthogonal if and only if $a^T = a^{-1}$.

17. Show that the transpose of a product ab is the product $b^T a^T$ of the
    transposed matrices in reverse order.

18. Show that the product of orthogonal matrices is orthogonal.

19. Verify that mapping by an orthogonal matrix preserves scalar products;
    that is, if a is orthogonal, then $(aX) \cdot (aY) = X \cdot Y$.

20. Show that any length-preserving matrix is orthogonal.

21. Prove that an affine transformation transforms the center of mass of
    a system of particles into the center of mass of the image particles.

2.3 Determinants 


a. Determinants of Second and Third Order 


Mathematical analysis includes the study of nonlinear mappings
in spaces of several dimensions. Such a study, however, has to be
preceded by one of the linear mappings Y = aX, where X and Y are
vectors and a is a matrix. In particular, it is of basic importance to
analyze the structure of the inverse of such a mapping or, what
amounts to the same thing, to analyze the structure of the solutions of
a system of n linear equations

(50)
$a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = y_1$
$a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = y_2$
$\cdots$
$a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = y_n$

for n unknown quantities $x_1, \ldots, x_n$.

The process of solving n linear equations in n variables leads to 
certain algebraic expressions called determinants, which have a great 
number of terms. In the beginning, the explicit definition and the prop- 
erties of determinants appear somewhat mystifying. The mystery 
will disappear when we base the definition of determinant on one 
single property, that of being a multilinear alternating form of n 
vectors in n-dimensional space. From this conceptual approach all the 
important properties of determinants can easily be derived. We shall 
see in later chapters of this book that determinants are of the utmost 
importance in extending differential and integral calculus to higher 
dimensions. 

It is instructive to write out the explicit solution of equations
(50) for the first few values of n. For n = 1 we have the single equation

$a_{11} x_1 = y_1$

with the solution

(50a) $x_1 = \dfrac{y_1}{a_{11}}.$

For n = 2 we have the system

$a_{11} x_1 + a_{12} x_2 = y_1$
$a_{21} x_1 + a_{22} x_2 = y_2.$

Multiplying the first equation by $a_{22}$, the second by $a_{12}$ and subtracting,
we eliminate $x_2$ and find a single equation for $x_1$; similarly,
multiplying the first equation by $a_{21}$ and the second by $a_{11}$ and subtracting
eliminates $x_1$. In this way we find for $x_1$, $x_2$ the expressions

(50b) $x_1 = \dfrac{a_{22} y_1 - a_{12} y_2}{a_{11} a_{22} - a_{12} a_{21}}, \qquad x_2 = \dfrac{a_{11} y_2 - a_{21} y_1}{a_{11} a_{22} - a_{12} a_{21}}.$
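
Formula (50b) can be tried out on a concrete system. The sketch below is an illustration that assumes integer coefficients so that exact fractions can be formed; the function name `solve_2x2` and its argument order are hypothetical.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, y1, y2):
    """Solve the n = 2 system by formula (50b); return None if the denominator is 0."""
    d = a11 * a22 - a12 * a21          # common denominator in (50b)
    if d == 0:
        return None
    x1 = Fraction(a22 * y1 - a12 * y2, d)
    x2 = Fraction(a11 * y2 - a21 * y1, d)
    return x1, x2

# Sample system (assumed for illustration): 2*x1 + x2 = 1, 5*x1 + 3*x2 = 4.
solution = solve_2x2(2, 1, 5, 3, 1, 4)
```

Substituting the returned pair back into both equations reproduces the right-hand sides.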


For n = 3 we have the system

(50c)
$a_{11} x_1 + a_{12} x_2 + a_{13} x_3 = y_1$
$a_{21} x_1 + a_{22} x_2 + a_{23} x_3 = y_2$
$a_{31} x_1 + a_{32} x_2 + a_{33} x_3 = y_3.$

We can reduce this system to two equations for $x_1$, $x_2$, thus eliminating
$x_3$, by multiplying the second equation by $a_{13}/a_{23}$ and subtracting
it from the first and by multiplying the third equation by $a_{13}/a_{33}$ and
subtracting it from the first. The two resulting equations for
$x_1$, $x_2$ alone can then be solved as before. After some algebraic
manipulation we find that

(50d) $x_1 = \dfrac{a_{22} a_{33} y_1 + a_{12} a_{23} y_3 + a_{13} a_{32} y_2 - a_{13} a_{22} y_3 - a_{23} a_{32} y_1 - a_{12} a_{33} y_2}{a_{11} a_{22} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{13} a_{22} a_{31} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33}},$

with similar formulae for $x_2$ and $x_3$. For n = 4, the computations become
completely unwieldy and it is clear that only a systematic approach
can bring order into the results.

We notice that in each case the solution $x_i$ takes the form of a
quotient, where the denominator is a function of the coefficients $a_{jk}$
alone, that is, a function of the matrix $a = (a_{jk})$. For n = 1 this function
is simply the coefficient $a_{11}$ itself. For n = 2, the denominator

$a_{11} a_{22} - a_{12} a_{21},$

formed from the elements of the matrix

$a = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix},$

is called the determinant of the matrix a and written

(51a) $a_{11} a_{22} - a_{12} a_{21} = \det(a) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}.$

It is clear that the numerators in (50b) also can be written as determinants,
giving rise to the expressions

(51b) $x_1 = \dfrac{\begin{vmatrix} y_1 & a_{12} \\ y_2 & a_{22} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}}, \qquad x_2 = \dfrac{\begin{vmatrix} a_{11} & y_1 \\ a_{21} & y_2 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}}.$

Of course, these formulae make sense only if the determinant in the
denominator does not have the value 0.

Formula (50d) suggests introducing as determinant of the third-order
matrix

$a = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$

the expression

(52a) $a_{11} a_{22} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{13} a_{22} a_{31} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33} = \det(a) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}.$

The law of formation of such a third-order determinant can be expressed
by the easily remembered "diagonal rule" (Fig. 2.5a). We repeat
the first two columns after the third; form the product of each triad
of numbers in the diagonal lines, multiplying the products associated
with lines slanting downward to the right by +1 and to the left by
−1; and add. (This rule holds only for third-order determinants!)

With the help of third-order determinants we can write the solution
of the system (50c) in the more concise form

$x_1 = \dfrac{\begin{vmatrix} y_1 & a_{12} & a_{13} \\ y_2 & a_{22} & a_{23} \\ y_3 & a_{32} & a_{33} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}, \quad x_2 = \dfrac{\begin{vmatrix} a_{11} & y_1 & a_{13} \\ a_{21} & y_2 & a_{23} \\ a_{31} & y_3 & a_{33} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}, \quad x_3 = \dfrac{\begin{vmatrix} a_{11} & a_{12} & y_1 \\ a_{21} & a_{22} & y_2 \\ a_{31} & a_{32} & y_3 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}.$

[Figure 2.5a: diagram of the diagonal rule for third-order determinants]

By analogy we define the determinant of the first-order matrix

$a = (a_{11})$

on the basis of (50a) as

$a_{11} = \det(a).$

We see then that in each of the cases n = 1, 2, 3 the solution
$(x_1, \ldots, x_n)$ of the system (50) can be described as follows ("Cramer's
rule"): Each unknown $x_i$ is the quotient of two determinants. In the
denominator we have the determinant of the matrix $a = (a_{jk})$; in the
numerator we have the determinant of the matrix obtained by replacing
the ith column of the matrix a by the quantities $y_1, y_2, \ldots, y_n$
appearing on the right-hand side of the equations.
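
Cramer's rule for n = 3 can be carried out mechanically from the expansion (52a). The following sketch assumes integer entries so that exact fractions can be formed; the helper names and the sample system are assumptions introduced for illustration.

```python
from fractions import Fraction

def det3(a):
    """Third-order determinant by the expansion (52a), i.e. the diagonal rule."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = a
    return (a11 * a22 * a33 + a12 * a23 * a31 + a13 * a21 * a32
            - a13 * a22 * a31 - a11 * a23 * a32 - a12 * a21 * a33)

def replace_column(a, i, y):
    """Copy of a whose column number i (counted from 0) is replaced by y."""
    n = len(a)
    return [[y[j] if k == i else a[j][k] for k in range(n)] for j in range(n)]

def cramer3(a, y):
    """Solve aX = Y for a 3 x 3 integer matrix by Cramer's rule; None if det(a) = 0."""
    d = det3(a)
    if d == 0:
        return None
    return [Fraction(det3(replace_column(a, i, y)), d) for i in range(3)]

a = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]   # sample coefficient matrix (illustrative)
y = [5, 1, 6]                            # sample right-hand side (illustrative)
x = cramer3(a, y)
```

When the determinant in the denominator vanishes the routine returns `None`, matching the remark that the formulae make sense only for a nonzero denominator.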


b. Linear and Multilinear Forms of Vectors


In order to define determinants of higher order and to formulate 
their principal properties, it is necessary to make use of some general 
algebraic notions. 


A function $f(a_1, \ldots, a_n)$ of the n independent variables $a_1, \ldots, a_n$
can be considered as a function of the vector $A = (a_1, \ldots, a_n)$ and written
in the form f(A). We call f a linear form in A, if

(53a) $f(A + B) = f(A) + f(B)$

for any two vectors A, B and

(53b) $f(\lambda A) = \lambda f(A)$

for any vector A and any scalar λ.
The two rules (53a, b) can be compressed into the single requirement
that

(54a) $f(\lambda A + \mu B) = \lambda f(A) + \mu f(B)$

for any vectors A, B and scalars λ, μ. Written out in detail, the rule
(54a) becomes

(54b) $f(\lambda a_1 + \mu b_1, \ldots, \lambda a_n + \mu b_n) = \lambda f(a_1, \ldots, a_n) + \mu f(b_1, \ldots, b_n).$

For example, the function

$f(A) = 3a_2 - 27a_3$
is a linear form, while 
$f(A) = |A| = \sqrt{a_1^2 + \cdots + a_n^2}$


is not. 
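
Rule (54a) can be tested numerically for a given function. In the sketch below the coefficient vector, the sample vectors, and the function names are assumptions chosen for illustration; a scalar product with a fixed vector passes the test, while the length |A| fails it.

```python
import math

def linear_form(c):
    """Return the function f(A) = C . A for a fixed coefficient vector C."""
    return lambda a: sum(ci * ai for ci, ai in zip(c, a))

def norm(a):
    """f(A) = |A|, which is not a linear form."""
    return math.sqrt(sum(ai * ai for ai in a))

def satisfies_54a(f, a, b, lam, mu, tol=1e-12):
    """Test rule (54a) for one choice of vectors a, b and scalars lam, mu."""
    combo = [lam * ai + mu * bi for ai, bi in zip(a, b)]
    return abs(f(combo) - (lam * f(a) + mu * f(b))) <= tol

f = linear_form([0, 3, -2])     # illustrative coefficients
A, B = [1, 2, 3], [4, -1, 0]    # illustrative vectors
```

A single failing choice of vectors and scalars is enough to show that a function is not a linear form; a passing choice is only a consistency check, not a proof.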
Relation (54a) immediately implies the more general rule for linear
forms

(54c) $f(\lambda_1 A_1 + \cdots + \lambda_m A_m) = \lambda_1 f(A_1) + \cdots + \lambda_m f(A_m)$

valid for any m vectors $A_1, \ldots, A_m$ and scalars $\lambda_1, \ldots, \lambda_m$. This rule
yields an explicit expression for the most general linear form in the
vector A. Using the coordinate vectors $E_1, \ldots, E_n$, we have by (2b)
the representation

$A = (a_1, \ldots, a_n) = a_1 E_1 + a_2 E_2 + \cdots + a_n E_n$

for the vector A. Hence, by (54c), f is of the form

(55a) $f(A) = a_1 f(E_1) + a_2 f(E_2) + \cdots + a_n f(E_n) = c_1 a_1 + c_2 a_2 + \cdots + c_n a_n$

where the $c_i$ have the constant values

(55b) $c_i = f(E_i).$

Combining the coefficients $c_i$ into the vector $C = (c_1, \ldots, c_n)$, we have

(55c) $f(A) = C \cdot A.$

The most general linear form in a vector A is the scalar product of A
with a suitable constant vector C.

A function f(A, B) of two vectors $A = (a_1, \ldots, a_n)$, $B = (b_1, \ldots, b_n)$
is called a bilinear form in A, B if f is a linear form in A for fixed
B and a linear form in B for fixed A; this means that we require that

(56a) $f(\lambda A + \mu B, C) = \lambda f(A, C) + \mu f(B, C)$
(56b) $f(A, \lambda B + \mu C) = \lambda f(A, B) + \mu f(A, C)$

for any vectors A, B, C and scalars λ, μ. The simplest example of a
bilinear form is the scalar product

$f(A, B) = A \cdot B.$


In this example, the rules (56a, b) just reduce to the associative and
distributive laws (15b, c), p. 132 for scalar products.
We find more generally from (56a, b) that

(56c) $f(\alpha A + \beta B, \gamma C + \delta D) = \alpha f(A, \gamma C + \delta D) + \beta f(B, \gamma C + \delta D)$
$= \alpha\gamma f(A, C) + \alpha\delta f(A, D) + \beta\gamma f(B, C) + \beta\delta f(B, D).$

Thus, we can operate with bilinear forms as with ordinary products in
"multiplying out" expressions. Using again the decomposition

$A = (a_1, \ldots, a_n) = a_1 E_1 + \cdots + a_n E_n$
$B = (b_1, \ldots, b_n) = b_1 E_1 + \cdots + b_n E_n$

for the vectors A, B, we arrive at the formula

$f(A, B) = f(a_1 E_1 + a_2 E_2 + \cdots + a_n E_n, \; b_1 E_1 + b_2 E_2 + \cdots + b_n E_n) = \sum_{j,k=1}^{n} a_j b_k f(E_j, E_k).$

Hence, the most general bilinear form in A, B is given by

(57a) $f(A, B) = \sum_{j,k=1}^{n} c_{jk} a_j b_k$

with constant coefficients

(57b) $c_{jk} = f(E_j, E_k).$

For B = A the bilinear form f goes over into the quadratic form

(57c) $f(A, A) = \sum_{j,k=1}^{n} c_{jk} a_j a_k.$

In a similar way one defines trilinear forms f(A, B, C) in three
vectors A, B, C as functions that are linear forms in each vector
separately. One finds, exactly as before, that the most general trilinear
form is given by an expression

(58a) $f(A, B, C) = \sum_{j,k,r=1}^{n} c_{jkr} a_j b_k c_r,$

where

(58b) $c_{jkr} = f(E_j, E_k, E_r).$

More general multilinear forms f in any number m of vectors can be
defined in an obvious manner. It is only the matter of notation that
injects a new element, since we can no longer associate different
letters with different vectors. We denote the vectors by $A_1, A_2, \ldots, A_m$
and introduce their components $a_{jk}$ by

$A_1 = (a_{11}, a_{21}, \ldots, a_{n1}), \quad A_2 = (a_{12}, a_{22}, \ldots, a_{n2}), \quad \ldots, \quad A_m = (a_{1m}, a_{2m}, \ldots, a_{nm}).$

The function f is a multilinear form $f(A_1, \ldots, A_m)$ in $A_1, A_2, \ldots, A_m$
if it is a linear form in each vector when the others are held fixed.
We can also consider f as function of the matrix

$a = (A_1, A_2, \ldots, A_m) = (a_{jk})$

that has $A_1, A_2, \ldots, A_m$ as column vectors. In analogy to (58a) the
most general multilinear form in $A_1, A_2, \ldots, A_m$ is given by

(59a) $f(A_1, A_2, \ldots, A_m) = \sum_{j_1, j_2, \ldots, j_m = 1}^{n} c_{j_1 j_2 \cdots j_m} \, a_{j_1 1} a_{j_2 2} \cdots a_{j_m m}$

where¹

(59b) $c_{j_1 j_2 \cdots j_m} = f(E_{j_1}, E_{j_2}, \ldots, E_{j_m}).$

c. Alternating Multilinear Forms. Definition of Determinants 


The determinants of second and third order defined in formulae
(51a) and (52a) are special multilinear forms. The determinant of
second order in (51a), p. 161 is a bilinear form of the two 2-dimensional
vectors

(60a) $A_1 = (a_{11}, a_{21}), \quad A_2 = (a_{12}, a_{22});$

¹The use of subscripts of subscripts in these formulae is somewhat cumbersome.
Here $j_1, j_2, \ldots, j_m$ stands for any combination of m numbers selected from the set of
numbers 1, 2, ..., n. Such a combination could also be considered as a function
j(k) whose domain is the set of numbers k = 1, 2, ..., m and whose range is in the
set of numbers j = 1, 2, ..., n. Any one of these combinations or functions gives
rise to a term in the sum in formula (59a).

the determinant of third order in (52a) is a trilinear function of the
three 3-dimensional vectors

(60b) $A_1 = (a_{11}, a_{21}, a_{31}), \quad A_2 = (a_{12}, a_{22}, a_{32}), \quad A_3 = (a_{13}, a_{23}, a_{33}).$

(The linearity of determinants in each vector separately follows by
inspection from the fact that each product in the explicit expansion
contains exactly one factor with a given second subscript.) The extra
feature that sets the determinants apart from other multilinear
forms is their alternating character.

A function of several arguments (which could be vectors or scalars)
is called alternating if it just changes in sign when we interchange
any two of the arguments. Examples of alternating functions of scalar
arguments are

(61a) $\phi(x, y) = y - x$
(61b) $\phi(x, y, z) = (z - y)(z - x)(y - x).$

A function f of two n-dimensional vectors $A_1$, $A_2$ is alternating if

$f(A_1, A_2) = -f(A_2, A_1)$

for all $A_1$, $A_2$. This implies in particular for $A_1 = A_2 = A$ that

$f(A, A) = 0.$

Let n = 2 and f be an alternating function of the vectors $A_1$, $A_2$
given by (60a), which is also a bilinear form. Then

$f(E_1, E_1) = f(E_2, E_2) = 0, \qquad f(E_2, E_1) = -f(E_1, E_2).$

It follows from (57a, b) that

(62a) $f(A_1, A_2) = f(a_{11} E_1 + a_{21} E_2, \; a_{12} E_1 + a_{22} E_2) = c\,(a_{11} a_{22} - a_{12} a_{21}) = c \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = c \det(A_1, A_2),$

where the constant c has the value

(62b) $c = f(E_1, E_2).$

Thus, every bilinear alternating form of two vectors $A_1$, $A_2$ in
two-dimensional space differs from the determinant of the matrix with
columns $A_1$, $A_2$ only by a constant factor c.
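
The conclusion of (62a, b) can be observed on an example. In the sketch below the alternating coefficients (with the arbitrary value 7) and the sample vectors are assumptions made for illustration; the form built from them agrees with c times the second-order determinant.

```python
def det2(a1, a2):
    """Second-order determinant of the matrix with columns a1, a2, as in (51a)."""
    return a1[0] * a2[1] - a2[0] * a1[1]

def make_bilinear(c):
    """Bilinear form f(A1, A2) = sum over j, k of c[j][k] * A1[j] * A2[k], following (57a)."""
    return lambda a1, a2: sum(c[j][k] * a1[j] * a2[k]
                              for j in range(2) for k in range(2))

# Alternating coefficients: c[j][j] = 0 and c[k][j] = -c[j][k] (the value 7 is arbitrary).
f = make_bilinear([[0, 7], [-7, 0]])
e1, e2 = [1, 0], [0, 1]
c = f(e1, e2)                 # the constant (62b)
a1, a2 = [3, -2], [5, 4]      # illustrative column vectors
```

With these data `f(a1, a2)` equals `c * det2(a1, a2)`, and interchanging the arguments changes the sign.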
More generally, an alternating bilinear form of two vectors in n
dimensions can be written

$f(A_1, A_2) = \sum_{j,k=1}^{n} c_{jk} a_{j1} a_{k2},$

where

$c_{jk} = -c_{kj}, \qquad c_{jj} = 0.$

Combining the terms with subscripts differing only by a permutation,
we can express f as a linear combination of second-order determinants:

(62c) $f(A_1, A_2) = \sum_{\substack{j,k=1 \\ j<k}}^{n} c_{jk} (a_{j1} a_{k2} - a_{k1} a_{j2}) = \sum_{\substack{j,k=1 \\ j<k}}^{n} c_{jk} \begin{vmatrix} a_{j1} & a_{k1} \\ a_{j2} & a_{k2} \end{vmatrix}.$

For an alternating function f of three vectors, we have the relations

(63a) $f(A, B, C) = -f(B, A, C) = -f(A, C, B) = -f(C, B, A),$

from which it follows that also

(63b) $f(A, B, C) = f(B, C, A) = f(C, A, B).$

In particular, f vanishes whenever two of its arguments are equal.
Let $A_1$, $A_2$, $A_3$ be the three-dimensional vectors given by (60b). By
(58a, b) the general alternating trilinear form f in $A_1$, $A_2$, $A_3$ is

$f(A_1, A_2, A_3) = \sum_{j,k,r=1}^{3} c_{jkr} a_{j1} a_{k2} a_{r3}.$

Here, using (63a, b),

$c_{jkr} = f(E_j, E_k, E_r) = \varepsilon_{jkr} f(E_1, E_2, E_3),$

with $\varepsilon_{jkr} = 0$ if two of the numbers j, k, r are equal and

(64a) $\varepsilon_{123} = \varepsilon_{231} = \varepsilon_{312} = 1, \qquad \varepsilon_{213} = \varepsilon_{132} = \varepsilon_{321} = -1.$


Using the fact that the function $\phi(x, y, z)$ in formula (61b) changes
sign whenever two of its arguments are interchanged, we find for
$\varepsilon_{jkr}$ the concise expression

(64b) $\varepsilon_{jkr} = \operatorname{sgn} \phi(j, k, r) = \operatorname{sgn}\,(r - k)(r - j)(k - j).$

Comparison with the expression (52a), p. 162 for a third-order determinant
shows that

(64c)  $f(A_1, A_2, A_3) = c \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix},$

where $c = f(E_1, E_2, E_3)$ is a constant. We have the same result as in
two dimensions: The most general trilinear alternating form in three
3-dimensional vectors $A_1$, $A_2$, $A_3$ differs from the determinant of the
matrix with columns $A_1$, $A_2$, $A_3$ only by a constant factor c. Obviously,
then, the third-order determinant of the matrix with columns $A_1$, $A_2$,
$A_3$ is that uniquely determined trilinear alternating form in the
vectors $A_1$, $A_2$, $A_3$ that has the value 1 when $A_1$, $A_2$, $A_3$ are respectively
equal to the coordinate vectors $E_1$, $E_2$, $E_3$.¹

It is clear now how we can define determinants of higher order. 
Let a be the matrix

(65a)  $a = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$

with column vectors $A_1, A_2, \ldots, A_n$. Let f be a multilinear
alternating form in $A_1, \ldots, A_n$. Then f is given by (59a). Here the
coefficients $c_{j_1 j_2 \cdots j_n}$ have the form
(65b)  $c_{j_1 j_2 \cdots j_n} = f(E_{j_1}, E_{j_2}, \ldots, E_{j_n}).$
They change sign whenever we interchange any two of the numbers
$j_1, j_2, \ldots, j_n$. Denote by $\phi(x_1, \ldots, x_n)$ the product


1The last condition expresses that the unit matrix e has the determinant 1. 
(65c)
$$\begin{aligned}
\phi(x_1, x_2, \ldots, x_n) &= (x_n - x_{n-1})(x_n - x_{n-2}) \cdots (x_n - x_2)(x_n - x_1) \\
&\quad \cdot (x_{n-1} - x_{n-2}) \cdots (x_{n-1} - x_2)(x_{n-1} - x_1) \\
&\quad \;\;\vdots \\
&\quad \cdot (x_3 - x_2)(x_3 - x_1) \\
&\quad \cdot (x_2 - x_1) \\
&= \prod_{\substack{j,k=1 \\ j<k}}^{n} (x_k - x_j).
\end{aligned}$$
It is easily seen that $\phi$ is an alternating function of the scalars
$x_1, \ldots, x_n$ that vanishes only when two of those scalars are equal. Then,

(65d)  $\varepsilon_{j_1 j_2 \cdots j_n} = \operatorname{sgn} \phi(j_1, j_2, \ldots, j_n)$

is an alternating function of $j_1, \ldots, j_n$, which only assumes the
values +1, 0, -1. For $j_1, \ldots, j_n$ restricted to the values 1, 2, ..., n,
we have $\varepsilon_{j_1 j_2 \cdots j_n} = 0$, unless the numbers $j_1, \ldots, j_n$ are distinct,
that is, unless they form a permutation of the numbers 1, 2, ..., n.
One calls $j_1, \ldots, j_n$ an even permutation of 1, 2, ..., n if
$\varepsilon_{j_1 j_2 \cdots j_n} = +1$ and an odd permutation if $\varepsilon_{j_1 j_2 \cdots j_n} = -1$. An even permutation
can be rearranged in the order 1, 2, ..., n by an even number of
interchanges of two elements, an odd permutation by an odd number
of such interchanges.
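The two descriptions of the sign of a permutation given above can be compared numerically. The following Python sketch computes $\varepsilon_{j_1 \cdots j_n}$ from the sign of the product (65c) and, separately, from the number of interchanges needed to restore the natural order. The function names and the particular sorting procedure are assumptions made for this illustration.

```python
from itertools import permutations

def sgn(x):
    """Sign of a number: +1, 0, or -1."""
    return (x > 0) - (x < 0)

def epsilon(js):
    """epsilon_{j1...jn} = sgn of the product of (j_s - j_r) over r < s, as in (65c, d)."""
    value = 1
    for r in range(len(js)):
        for s in range(r + 1, len(js)):
            value *= sgn(js[s] - js[r])
    return value

def parity_by_interchanges(js):
    """+1 or -1 according to whether sorting the permutation js of 1..n
    takes an even or an odd number of interchanges of two elements."""
    js = list(js)
    swaps = 0
    for i in range(len(js)):
        while js[i] != i + 1:
            target = js[i] - 1  # position where the value js[i] belongs
            js[i], js[target] = js[target], js[i]
            swaps += 1
    return 1 if swaps % 2 == 0 else -1

# Compare the two descriptions on every permutation of 1, 2, 3, 4.
agree = all(epsilon(p) == parity_by_interchanges(p)
            for p in permutations(range(1, 5)))
print(agree)
print(epsilon((1, 1, 2)))  # a repeated index makes the symbol vanish
```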
Obviously, by (65b),

(65e)  $c_{j_1 j_2 \cdots j_n} = \varepsilon_{j_1 j_2 \cdots j_n}\, f(E_1, \ldots, E_n).$

We define the determinant of the matrix a in (65a) as

(66a)
$$\det(a) = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} = \sum_{j_1, \ldots, j_n = 1}^{n} \varepsilon_{j_1 j_2 \cdots j_n}\, a_{j_1 1} a_{j_2 2} \cdots a_{j_n n}.$$

We have then the result: The most general multilinear alternating
form f in n n-dimensional vectors $A_1, \ldots, A_n$ differs from the
determinant of the matrix with columns $A_1, \ldots, A_n$ only by the constant
factor $c = f(E_1, \ldots, E_n)$.
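The definition (66a) can be transcribed almost literally into code. The sketch below, in Python, sums over all permutations $j_1, \ldots, j_n$ (the only index combinations with nonvanishing coefficient) and takes the sign from the number of inversions. The matrix is stored as a list of rows with zero-based indices; the names `sign` and `det_leibniz` are assumptions of this illustration.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one (counted by inversions)."""
    n = len(perm)
    inversions = sum(1 for r in range(n) for s in range(r + 1, n)
                     if perm[r] > perm[s])
    return -1 if inversions % 2 else 1

def det_leibniz(a):
    """Determinant from the expansion (66a); a is a list of rows."""
    n = len(a)
    total = 0
    for js in permutations(range(n)):
        term = sign(js)
        for column in range(n):
            term *= a[js[column]][column]  # factor a_{j_k, k}
        total += term
    return total

print(det_leibniz([[1, 2], [3, 4]]))
print(det_leibniz([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
```

The loop visits n! permutations, so this form is useful only for small n.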
d. Principal Properties of Determinants


Formula (66a) gives the explicit expansion of an nth-order determinant
in terms of its $n^2$ elements $a_{jk}$. Counting only the terms with
nonvanishing coefficients $\varepsilon_{j_1 j_2 \cdots j_n}$, the determinant is an nth-degree
form in the $a_{jk}$ consisting of n! terms. Each term (aside from the
coefficient $\varepsilon_{j_1 j_2 \cdots j_n} = \pm 1$) is a product of n of the elements, one from
each column and from each row. In principle, the expansion formula
makes it possible to compute a determinant for any given values of
the elements. In practice, the formula has too many terms to keep
track of (120 in the case of fifth-order determinants; 3,628,800 in the
case of tenth-order determinants) to be useful for numerical
computations, and more efficient ways of evaluating determinants have
been devised.
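One such way, which the text does not spell out, is elimination: the matrix is brought to triangular form by interchanging rows (each interchange changes the sign) and by adding multiples of one row to another (which leaves the determinant unchanged), and the determinant is then the product of the diagonal elements. The Python sketch below assumes these row rules, which follow from the column rules of this section together with the transposition rule (68d). The function name `det_elimination` is hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def det_elimination(a):
    """Determinant by reduction to triangular form; a is a list of rows."""
    m = [[Fraction(x) for x in row] for row in a]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot available: the determinant vanishes
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result            # a row interchange changes the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]  # leaves the determinant unchanged
    return result

print(det_elimination([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
```

The work grows like n³ rather than n!, which is why elimination is preferred for numerical purposes.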

The basic properties of determinants already are incorporated in
our definition as alternating multilinear forms of n vectors $A_1, A_2,
\ldots, A_n$ in n-dimensional space. If a is the matrix with these vectors
as column vectors, we write

$\det(a) = \det(A_1, \ldots, A_n).$

It follows immediately that the determinant of the square matrix a
changes sign if we interchange any two columns of a; in particular,
the determinant of a matrix a with two identical columns vanishes.
Using the linearity of the determinant in each of its column vectors
separately, we find that multiplying one column of the matrix a by a
factor $\lambda$ has the effect of multiplying the determinant of a by $\lambda$.¹ For
example,

(67a)  $\det(\lambda A_1, A_2, \ldots, A_n) = \lambda \det(A_1, A_2, \ldots, A_n).$
In particular, we find for $\lambda = 0$ and $A_1$ arbitrary that

(67b)  $\det(0, A_2, \ldots, A_n) = 0.$


The same considerations apply, of course, to any other column, and 
we find that the determinant of a matrix a vanishes if any column of a 
is the zero vector. From the multilinearity of determinants, we con- 
clude more generally that 


¹Multiplying all elements of the nth-order matrix a by the factor $\lambda$ is equivalent to
multiplying each of its n columns by $\lambda$ and, hence, results in multiplying the
determinant of a by $\lambda^n$. Thus, $\det(\lambda a) = \lambda^n \det(a)$.
(67c)
$$\begin{aligned}
\det(A_1 + \lambda A_2, A_2, \ldots, A_n) &= \det(A_1, A_2, \ldots, A_n) + \lambda \det(A_2, A_2, \ldots, A_n) \\
&= \det(A_1, A_2, \ldots, A_n),
\end{aligned}$$
since the matrix $(A_2, A_2, \ldots, A_n)$ has two identical columns. Generally,
the value of the determinant of the matrix a does not change if we
add a multiple of one column of a to a different column.¹
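The column rules just derived are easy to test on a concrete third-order determinant. The following Python sketch is an illustration only; the vectors and the helper names `det3`, `scale`, and `add` are assumptions, and the determinant is written out by expansion along the first row.

```python
def det3(A, B, C):
    """Determinant of the 3x3 matrix with columns A, B, C."""
    return (A[0] * (B[1] * C[2] - B[2] * C[1])
            - B[0] * (A[1] * C[2] - A[2] * C[1])
            + C[0] * (A[1] * B[2] - A[2] * B[1]))

def scale(factor, V):
    return tuple(factor * v for v in V)

def add(V, W):
    return tuple(v + w for v, w in zip(V, W))

A, B, C = (2, 1, 1), (0, 3, 1), (1, 2, 4)
d = det3(A, B, C)

print(det3(B, A, C) == -d)                    # interchanging two columns
print(det3(scale(5, A), B, C) == 5 * d)       # (67a) with lambda = 5
print(det3(add(A, scale(7, B)), B, C) == d)   # (67c) with lambda = 7
print(det3(scale(2, A), scale(2, B), scale(2, C)) == 2 ** 3 * d)  # det(lambda a)
```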

Of fundamental importance is the multiplication law for deter- 
minants: 


The determinant of the product of two nth-order matrices a and b 
is the product of their determinants: 


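(68a)  $\det(ab) = \det(a) \cdot \det(b).$
Written out by elements, the rule takes the form

(68b)
$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} \cdot \begin{vmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{vmatrix} = \begin{vmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{vmatrix},$$

where

(68c)  $c_{jk} = a_{j1}b_{1k} + a_{j2}b_{2k} + \cdots + a_{jn}b_{nk} = \sum_{r=1}^{n} a_{jr} b_{rk}.$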


This law is a simple consequence of our definition of determinants. 
Let c = ab be the product matrix. We hold the matrix a fixed and 
consider the determinant of c in its dependence on b. By (68c) the 
kth-column vector of the matrix c 


$C_k = (c_{1k}, c_{2k}, \ldots, c_{nk})$
has elements $c_{jk}$, which are linear forms in the kth-column vector $B_k$


¹Obviously multiplying a column by the factor $\lambda$ and adding it to the same column
changes the value of the determinant by the factor $1 + \lambda$.
of the matrix b. It follows that det(c) is a linear form in the vector $B_k$
when the other columns of b are held fixed. It is also clear that
interchanging two columns of b corresponds exactly to interchanging the
corresponding columns of c. Hence, det(c) is an alternating
multilinear form in the column vectors of the matrix b. Consequently
(see p. 170),

$\det(c) = \gamma \det(b),$
where $\gamma$ is the value of det(c) for the case where
$B_1 = E_1, \; B_2 = E_2, \; \ldots, \; B_n = E_n$

or where b is the unit matrix e. Now, if b = e, then obviously c =
ab = ae = a, and consequently $\gamma = \det(a)$. This proves (68a).
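The multiplication law can be checked on a small example. The Python sketch below forms the product matrix by (68c) and compares determinants. The two matrices and the helper names `det` and `matmul` are assumptions chosen for illustration, and the determinant is computed by a simple recursive expansion along the first row.

```python
def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for k, entry in enumerate(m[0]):
        minor = [row[:k] + row[k + 1:] for row in m[1:]]
        total += (-1) ** k * entry * det(minor)
    return total

def matmul(a, b):
    """Product c = ab with c_jk = sum over r of a_jr * b_rk, as in (68c)."""
    n = len(a)
    return [[sum(a[j][r] * b[r][k] for r in range(n)) for k in range(n)]
            for j in range(n)]

a = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
b = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
c = matmul(a, b)
print(det(a), det(b), det(c))
print(det(c) == det(a) * det(b))
```

Taking b to be the unit matrix reproduces the step of the proof in which the constant factor is identified as det(a).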

On p. 157 we defined the transpose $a^T$ of the matrix a as the matrix
obtained from a by interchanging rows and columns. We have the
surprising fact that a square matrix and its transpose have the same
determinant:

(68d)  $\det(a^T) = \det(a)$

or

(68e)
$$\begin{vmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{vmatrix} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}.$$

For n = 2, 3 one easily verifies this identity from the explicit
expressions (51a), (52a), pp. 161-2. We only indicate the proof for general
n, which can be based on the expansion formula (66a) for det(a). In
each term of the sum with nonvanishing coefficient, we can rearrange
the factors according to the first subscripts, so that

$a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1 k_1} a_{2 k_2} \cdots a_{n k_n},$

where $k_1, k_2, \ldots, k_n$ form again a permutation of the numbers 1, 2,
..., n.¹ One easily shows that


¹Looking at $j_1, j_2, \ldots, j_n$ as a function mapping the set 1, 2, ..., n onto itself, we
have in $k_1, k_2, \ldots, k_n$ just the inverse function; that is, the equation $j_r = s$ is
equivalent to $k_s = r$.
$\varepsilon_{j_1 j_2 \cdots j_n} = \varepsilon_{k_1 k_2 \cdots k_n}$

(this is left as an exercise for the reader). Hence,

$\det(a) = \sum_{k_1, \ldots, k_n = 1}^{n} \varepsilon_{k_1 k_2 \cdots k_n}\, a_{1 k_1} a_{2 k_2} \cdots a_{n k_n} = \det(a^T).$


An immediate consequence of formula (68d) is that a determinant can 
be considered as an alternating multilinear function of its row vectors. 
In particular a determinant changes sign if we interchange any two 
rows. 

The multiplication rule (68a) states that the product of the determinants
of two square matrices a, b is equal to the determinant of the
matrix ab whose elements are the scalar products of the row vectors of
a with the column vectors of b. We use now that the determinant of a
matrix a is equal to the determinant of its transpose $a^T$, which is
obtained by interchanging rows and columns of a. It follows then that

$\det(a) \cdot \det(b) = \det(a^T) \cdot \det(b) = \det(a^T b).$
Hence, the product of the determinants of the matrices a and b is also
equal to the determinant of the matrix $a^T b$, obtained by forming the
scalar products of the columns of a with the columns of b. If

$a = (A_1, \ldots, A_n)$ and $b = (B_1, \ldots, B_n),$

we obtain the identity

(68f)
$$\det(A_1, \ldots, A_n) \cdot \det(B_1, \ldots, B_n) = \begin{vmatrix} A_1 \cdot B_1 & A_1 \cdot B_2 & \cdots & A_1 \cdot B_n \\ A_2 \cdot B_1 & A_2 \cdot B_2 & \cdots & A_2 \cdot B_n \\ \vdots & \vdots & & \vdots \\ A_n \cdot B_1 & A_n \cdot B_2 & \cdots & A_n \cdot B_n \end{vmatrix}.$$

A simple application of these rules to orthogonal matrices a, for
which [see formula (49), p. 158] $a^{-1} = a^T$ or $a^T a = e$, yields

$\det(a^T a) = \det(a^T) \cdot \det(a) = [\det(a)]^2 = \det(e) = 1.$
Consequently, the determinant of an orthogonal matrix can only have
the values +1 or —1. The geometric interpretation of this result will 
be given on p. 202. 
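Both values actually occur. The Python sketch below checks two orthogonal matrices of second order, one with determinant +1 and one with determinant -1. The matrices (built from the numbers 3/5 and 4/5 so that the arithmetic is exact) and the helper names are assumptions made for this illustration.

```python
from fractions import Fraction

def transpose(m):
    return [list(column) for column in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][r] * b[r][k] for r in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

identity = [[1, 0], [0, 1]]
rotation = [[Fraction(3, 5), Fraction(-4, 5)],
            [Fraction(4, 5), Fraction(3, 5)]]
reflection = [[Fraction(3, 5), Fraction(4, 5)],
              [Fraction(4, 5), Fraction(-3, 5)]]

for name, a in (("rotation", rotation), ("reflection", reflection)):
    is_orthogonal = matmul(transpose(a), a) == identity  # a^T a = e
    print(name, is_orthogonal, det2(a))
```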


e. Application of Determinants to Systems of Linear Equations 


Determinants provide a convenient tool for deciding when n
vectors $A_1, A_2, \ldots, A_n$ in n-dimensional space are dependent or,
equivalently, when the square matrix a with columns $A_1, \ldots, A_n$
is singular.


The necessary and sufficient condition for a square matrix to be singular 
is that its determinant vanishes. 

Let indeed a be singular. Then the column vectors $A_1, A_2, \ldots, A_n$
are dependent. Thus, one of the column vectors, say $A_1$, is dependent
on the others:

$A_1 = \lambda_2 A_2 + \lambda_3 A_3 + \cdots + \lambda_n A_n.$

It follows from the multilinearity of determinants that

$$\begin{aligned}
\det(a) &= \det(\lambda_2 A_2 + \lambda_3 A_3 + \cdots + \lambda_n A_n,\; A_2, A_3, \ldots, A_n) \\
&= \lambda_2 \det(A_2, A_2, A_3, \ldots, A_n) + \lambda_3 \det(A_3, A_2, A_3, \ldots, A_n) \\
&\quad + \cdots + \lambda_n \det(A_n, A_2, A_3, \ldots, A_n) \\
&= 0,
\end{aligned}$$

since each of the matrices has a repeated column.¹
Conversely, if a is nonsingular, there exists (see p. 155) a reciprocal
$b = a^{-1}$ of a:

$ab = e,$

where e is the unit matrix. By the multiplication rule for
determinants, it follows that

$\det(a) \cdot \det(b) = \det(e) = 1$

and, hence, that $\det(a) \ne 0$. This proves that a is singular if and only
if det(a) = 0.
We consider now the system of linear equations


1More generally, this argument shows that an alternating multilinear form in m 
vectors in n-dimensional space vanishes identically for m > n, since then the vectors 
are necessarily dependent. 
(69a)
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= y_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= y_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= y_n
\end{aligned}$$

corresponding to the matrix a. Following the discussion on p. 150 we
have to distinguish the two cases (1) $\det(a) \ne 0$ and (2) $\det(a) = 0$.
In case (1) equations (69a) have a unique solution for every $y_1, \ldots,
y_n$. In case (2) there does not always exist a solution, and it is never
unique. We now have not only an explicit test to distinguish between
the two cases with the help of determinants but also shall find the
means to calculate the solution in case (1). Introducing the vector


$Y = (y_1, y_2, \ldots, y_n),$
we can write the system (69a) in the form
(69b)  $x_1 A_1 + x_2 A_2 + \cdots + x_n A_n = Y,$

where the $A_k$ are the column vectors of the matrix a. Then,

$$\begin{aligned}
\det(Y, A_2, A_3, \ldots, A_n) &= \det(x_1 A_1 + x_2 A_2 + \cdots + x_n A_n,\; A_2, A_3, \ldots, A_n) \\
&= x_1 \det(A_1, A_2, A_3, \ldots, A_n) + x_2 \det(A_2, A_2, A_3, \ldots, A_n) \\
&\quad + x_3 \det(A_3, A_2, A_3, \ldots, A_n) + \cdots + x_n \det(A_n, A_2, A_3, \ldots, A_n) \\
&= x_1 \det(A_1, A_2, \ldots, A_n)
\end{aligned}$$

and similarly,
$\det(A_1, Y, A_3, \ldots, A_n) = x_2 \det(A_1, A_2, \ldots, A_n)$

and so on. If the matrix a is nonsingular, we can divide by its
determinant and obtain the solution $x_1, x_2, \ldots, x_n$ expressed by
determinants:

$$x_1 = \frac{\det(Y, A_2, \ldots, A_n)}{\det(A_1, A_2, \ldots, A_n)}, \quad x_2 = \frac{\det(A_1, Y, \ldots, A_n)}{\det(A_1, A_2, \ldots, A_n)}, \quad \ldots, \quad x_n = \frac{\det(A_1, A_2, \ldots, A_{n-1}, Y)}{\det(A_1, A_2, \ldots, A_n)}.$$
This is Cramer's rule for the solution of n linear equations in n
unknown quantities.
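Cramer's rule translates directly into a short program: for each k, the kth column of a is replaced by Y and the resulting determinant is divided by det(a). The Python sketch below is an illustration under stated assumptions: the system and the names `det` and `cramer` are made up for the example, the matrix is stored as a list of rows with integer entries, and the determinant is computed by recursive expansion along the first row.

```python
from fractions import Fraction

def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for k, entry in enumerate(m[0]):
        minor = [row[:k] + row[k + 1:] for row in m[1:]]
        total += (-1) ** k * entry * det(minor)
    return total

def cramer(a, y):
    """Solve the system with integer matrix a (list of rows) and right side y."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular; Cramer's rule does not apply")
    solution = []
    for k in range(len(a)):
        # Replace the k-th column of a by the vector y.
        replaced = [row[:k] + [y[i]] + row[k + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(replaced), d))
    return solution

a = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
y = [5, 13, 15]
print(cramer(a, y))
```

For larger systems the rule is mainly of theoretical interest, since it requires n + 1 determinants.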


Exercises 2.3 


1. Evaluate the following determinants: 


3.4 5 1141 
(a)|/4 5 6 (c)i2 3 4 
5 6 7 3—1 7 
1 1 1 1 x x’ 
(b);1 2 4 (d) 1 y y? 
1 3 9 1z 23 


2. Find the relation that must exist between a, b, c in order that the system
of equations

3x + 4y + 5z = a
4x + 5y + 6z = b
5x + 6y + 7z = c

may have a solution.
3. (a) Verify that the determinant of the unit matrix is 1.
(b) Show that if a is nonsingular, then $\det(a^{-1}) = 1/\det(a)$.

4. Obtain the values of
(a) $\varepsilon_{321}$, (b) $\varepsilon_{2143}$, (c) $\varepsilon_{4231}$, (d) $\varepsilon_{54321}$.

5. Show that the determinant

$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix}$

can always be reduced to the form

$\begin{vmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{vmatrix}$

merely by repeated application of the following processes: (1)
interchanging two rows or two columns, and (2) adding a multiple of one
row (or column) to another row (or column).

6. A matrix is diagonal if $a_{ij} = 0$ whenever $i \ne j$. Show that the
determinant of the n × n diagonal matrix $(a_{ij})$ is the product $a_{11} a_{22} \cdots a_{nn}$.
7. The matrix $(a_{ij})$ is upper-triangular if $a_{ij} = 0$ whenever j < i. Show that

$\det(a_{ij}) = a_{11} a_{22} \cdots a_{nn}.$


8. Evaluate 
(a) 1x x 
ly y’ 
1 z 2 
(b) 1! 2! 3! 
2! 3! 4! 
3! 4! 5! 
(c) 1! 2! 3! 4! 
2! 3! 4! 5! 
3! 4! 5! 6! 
4! 5! 6! 7! 


9. Solve the equations 
2x - 3y + 4z = 4
4x - 9y + 16z = 10
8x - 27y + 64z = 34.


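10. Prove the identity
$(a^2 + b^2)(c^2 + d^2) = (ac + bd)^2 + (bc - ad)^2$

by forming the product of the determinants

$\begin{vmatrix} a & b \\ -b & a \end{vmatrix}$ and $\begin{vmatrix} c & d \\ -d & c \end{vmatrix}.$
11. If $A = x^2 + y^2 + z^2$, $B = xy + yz + zx$, show that
$D = \begin{vmatrix} B & A & B \\ B & B & A \\ A & B & B \end{vmatrix} = (x^3 + y^3 + z^3 - 3xyz)^2.$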


12. Show that

$\begin{vmatrix} t_1 + x & a + x & a + x & a + x \\ b + x & t_2 + x & a + x & a + x \\ b + x & b + x & t_3 + x & a + x \\ b + x & b + x & b + x & t_4 + x \end{vmatrix}$

is of the form A + Bx, where A and B are independent of x. By giving
particular values to x, prove that

$A = \frac{a f(b) - b f(a)}{a - b}, \quad B = -\frac{f(b) - f(a)}{b - a},$

where
$f(t) = (t_1 - t)(t_2 - t)(t_3 - t)(t_4 - t).$

13. Prove that any bilinear form f in A and B may be written
$A \cdot (cB) = (c^T A) \cdot B.$

14. Prove that in a nonsingular affine transformation the image of a quadric
$ax^2 + by^2 + cz^2 + dxy + exz + fyz + gx + hy + iz + j = 0$
is another quadric.

15. If the three determinants

$\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}, \quad \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}, \quad \begin{vmatrix} c_1 & c_2 \\ a_1 & a_2 \end{vmatrix}$

do not all vanish, then the necessary and sufficient condition for the
existence of a solution of the three equations

$a_1 x + a_2 y = d$
$b_1 x + b_2 y = e$
$c_1 x + c_2 y = f$
is
$D = \begin{vmatrix} a_1 & a_2 & d \\ b_1 & b_2 & e \\ c_1 & c_2 & f \end{vmatrix} = 0.$

16. State the condition that the two straight lines $x = a_1 t + b_1$, $y = a_2 t + b_2$,
$z = a_3 t + b_3$ and $x = c_1 t + d_1$, $y = c_2 t + d_2$, $z = c_3 t + d_3$
either intersect or are parallel.

17. Prove (68d) by verifying that it does not matter whether the factors in
each term of the expansion (66a) are ordered by their first or second
subscripts, namely, with

$a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1 k_1} a_{2 k_2} \cdots a_{n k_n},$
that
$\varepsilon_{j_1 j_2 \cdots j_n} = \varepsilon_{k_1 k_2 \cdots k_n}.$

18. Prove that the affine transformation
$x' = ax + by + cz$
$y' = dx + ey + fz$
$z' = gx + hy + kz$

leaves at least one direction unaltered.
2.4 Geometrical Interpretation of Determinants


a. Vector Products and Volumes of Parallelepipeds in Three- 
Dimensional Space 


In Volume I (p. 388) we defined the “cross product” of two vectors
$A = (a_1, a_2)$ and $B = (b_1, b_2)$ in the plane as the scalar

(70a)  $A \times B = a_1 b_2 - a_2 b_1.$

Here $|A \times B|$ represents twice the area of the triangle with vertices
$P_0$, $P_1$, $P_2$, where $A = \overrightarrow{P_0P_1}$, $B = \overrightarrow{P_0P_2}$. We call $|A \times B|$ the area of
the parallelogram spanned by the vectors A, B, that is, of the
parallelogram with successive vertices $P_0$, $P_1$, Q, $P_2$. The sign of $A \times B$
determines the orientation of the parallelogram.¹ In determinant
notation the cross product takes the form

(70b)  $A \times B = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = \det(A, B).$

Thus, |det(A, B)| can be interpreted geometrically as the area of the 
parallelogram spanned by the vectors A, B. Analogous interpretations 
will be found for higher-order determinants. 
For three vectors $A = (a_1, a_2, a_3)$, $B = (b_1, b_2, b_3)$, $C = (c_1, c_2, c_3)$
in three-dimensional space, it is natural to form the determinant

$\det(A, B, C) = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}.$

Written out as a linear form in the vector C we have, by (52a),
(71a)
$$\begin{aligned}
\det(A, B, C) &= (a_2b_3 - a_3b_2)c_1 + (a_3b_1 - a_1b_3)c_2 + (a_1b_2 - a_2b_1)c_3 \\
&= Z \cdot C,
\end{aligned}$$

where $Z = (z_1, z_2, z_3)$ is the vector with components

(71b)  $z_1 = a_2b_3 - a_3b_2 = \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix},$


1We have A x B > 0 if the sense (counterclockwise or clockwise) in which the 
vertices follow each other is the same as that for the “coordinate square” with 
successive vertices (0, 0), (1, 0), (1, 1,), (0, 1). 
$z_2 = a_3b_1 - a_1b_3 = \begin{vmatrix} a_3 & b_3 \\ a_1 & b_1 \end{vmatrix},$

$z_3 = a_1b_2 - a_2b_1 = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}.$


We call the vector Z the “vector product,” or “cross product,” of the
vectors A, B and write $Z = A \times B$.¹ Then, by definition,

(71c)  $\det(A, B, C) = (A \times B) \cdot C.$

Because of this formula the scalar det(A, B, C) is sometimes referred
to as the triple vector product of A, B, C.

The components $z_i$ of the vector $Z = A \times B$ are themselves second-order
determinants and, hence, are bilinear alternating forms of
the vectors A, B. This leads immediately to the laws for vector
multiplication:

(72a)  $(\lambda A) \times B = A \times (\lambda B) = \lambda (A \times B);$
(72b)  $(A' + A'') \times B = A' \times B + A'' \times B;$

       $A \times (B' + B'') = A \times B' + A \times B'';$
(72c)  $A \times B = -B \times A.$


Relation (72c) could be called the “anticommutative” law of multi- 
plication. It has the important consequence that 


(72d) A x A=0 for all vectors A. 


More generally, the vector product of two vectors A, B vanishes if 
and only if A and B are dependent. For by (71c) the relation A x B 
= 0 is equivalent to 


$\det(A, B, C) = 0$ for all vectors C,

or to the fact (see p. 175) that A, B, C are dependent for all C. Now we
can always find a vector C that is independent of A and B (see p. 139).
Then the dependence of A, B, C implies that A and B are dependent.
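The component formulas (71b), the relation (71c), and the laws (72c, d) can be checked on concrete vectors. The following Python sketch is illustrative only; the vectors and the function names `cross`, `dot`, and `det3` are assumptions of the example.

```python
def cross(A, B):
    """Vector product with the components (71b)."""
    return (A[1] * B[2] - A[2] * B[1],
            A[2] * B[0] - A[0] * B[2],
            A[0] * B[1] - A[1] * B[0])

def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def det3(A, B, C):
    """Determinant of the 3x3 matrix with columns A, B, C."""
    return (A[0] * (B[1] * C[2] - B[2] * C[1])
            - B[0] * (A[1] * C[2] - A[2] * C[1])
            + C[0] * (A[1] * B[2] - A[2] * B[1]))

A, B, C = (1, 2, 3), (4, 5, 6), (7, 8, 10)
Z = cross(A, B)
print(Z)
print(dot(Z, C) == det3(A, B, C))                # relation (71c)
print(cross(B, A) == tuple(-z for z in Z))       # anticommutative law (72c)
print(cross(A, (2, 4, 6)))                       # dependent vectors give the zero vector
```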


1The vector product of two vectors in three-dimensions is again a vector, in contrast 
to cross products of vectors in two dimensions and scalar products in any number of 
dimensions, which are scalars. 
The vector product $A \times B$ is perpendicular to both of the vectors
A and B, since by (71c),

(72e)  $(A \times B) \cdot A = \det(A, B, A) = 0, \quad (A \times B) \cdot B = \det(A, B, B) = 0.$

Hence, for $A = \overrightarrow{P_0P_1}$ and $B = \overrightarrow{P_0P_2}$ independent, the direction of $A \times B$
is one of the two directions perpendicular to any plane $P_0P_1P_2$
spanned by A and B. The length of the vector $A \times B$ also has a simple
geometric interpretation. We have, by (71b),

(72f)
$$\begin{aligned}
|A \times B|^2 &= (a_2b_3 - a_3b_2)^2 + (a_3b_1 - a_1b_3)^2 + (a_1b_2 - a_2b_1)^2 \\
&= (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2 \\
&= |A|^2 |B|^2 - (A \cdot B)^2.
\end{aligned}$$ ¹


Using the fact [formula (14), p. 131] that
$A \cdot B = |A|\,|B| \cos\gamma,$

where $\gamma$ is the angle between the directions of A and B, we find from
(72f) that

$|A \times B| = \sqrt{|A|^2|B|^2 - |A|^2|B|^2 \cos^2\gamma} = |A|\,|B| \sin\gamma.$

For $A = \overrightarrow{P_0P_1}$, $B = \overrightarrow{P_0P_2}$ we have in $|B| \sin\gamma$ (where $\gamma$ is assigned
a value between 0 and $\pi$) the distance of the point $P_2$ from the line
$P_0P_1$ (Fig. 2.6). Hence (exactly as in two dimensions), the quantity
$|A \times B|$ gives the area of the parallelogram with vertices $P_0$, $P_1$, Q, $P_2$
“spanned” by the vectors A, B or twice the area of the triangle with
vertices $P_0$, $P_1$, $P_2$.
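The identity (72f) and the resulting area formula can be verified numerically. In the Python sketch below the vectors and helper names are assumptions of the example; the first comparison uses exact integer arithmetic, the second uses floating-point values and therefore a tolerance.

```python
import math

def cross(A, B):
    return (A[1] * B[2] - A[2] * B[1],
            A[2] * B[0] - A[0] * B[2],
            A[0] * B[1] - A[1] * B[0])

def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

A, B = (1, 2, 3), (4, 5, 6)
Z = cross(A, B)

# Identity (72f): |A x B|^2 = |A|^2 |B|^2 - (A . B)^2
lhs = dot(Z, Z)
rhs = dot(A, A) * dot(B, B) - dot(A, B) ** 2
print(lhs, rhs)

# Area of the parallelogram as |A| |B| sin(gamma), gamma between 0 and pi.
gamma = math.acos(dot(A, B) / math.sqrt(dot(A, A) * dot(B, B)))
area = math.sqrt(dot(A, A)) * math.sqrt(dot(B, B)) * math.sin(gamma)
print(math.isclose(area, math.sqrt(lhs)))
```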

The individual components of the product $A \times B = (z_1, z_2, z_3)$ also
can be interpreted geometrically. For example, the expression

$z_3 = a_1b_2 - a_2b_1$
is just the cross product of the two-dimensional vectors $(a_1, a_2)$ and

¹This identity incidentally yields an immediate proof of the Cauchy-Schwarz
inequality

$|A \cdot B| \le |A|\,|B|$
(see p. 132). It also supplies the additional piece of information that the equality sign
holds if and only if the vectors A and B are dependent.
Figure 2.6 Area $|A \times B|$ of parallelogram spanned by two vectors A, B.

Figure 2.7 Components of vector product $A \times B = (z_1, z_2, z_3)$ interpreted as projected areas.

$(b_1, b_2)$ [see (70a)]. If $P_0$ has the coordinates $\xi_1, \xi_2, \xi_3$, we have in $|z_3|$
the area of the parallelogram in the $x_1, x_2$-plane with vertices $(\xi_1, \xi_2)$,
$(\xi_1 + a_1, \xi_2 + a_2)$, $(\xi_1 + a_1 + b_1, \xi_2 + a_2 + b_2)$, $(\xi_1 + b_1, \xi_2 + b_2)$. This
parallelogram is just the projection onto the $x_1, x_2$-plane of the
parallelogram with vertices $P_0$, $P_1$, Q, $P_2$, spanned in space by the vectors
A, B (see Fig. 2.7). If $A \times B$ has the direction cosines $\cos\beta_1$, $\cos\beta_2$,
$\cos\beta_3$, we have [see (9), p. 129]

$|z_3| = |A \times B|\,|\cos\beta_3|.$
Thus $|\cos\beta_3|$ gives the ratio of the area of the projection on the
$x_1, x_2$-plane to the area of the parallelogram spanned by A and B. Here $\beta_3$
is the angle between the normal of the plane through $P_0$, $P_1$, $P_2$ and the
$x_3$-axis. This is, of course, the same angle as that between the plane
containing the parallelogram spanned by A and B and the $x_1, x_2$-plane.¹


If $A = \overrightarrow{P_0P_1}$ and $B = \overrightarrow{P_0P_2}$ are independent vectors, we have
$A \times B = \overrightarrow{P_0R}$, where the point R lies on the line through $P_0$ perpendicular
to the plane $P_0P_1P_2$ and at a distance from $P_0$ equal to twice the area
of the triangle $P_0P_1P_2$. This fixes R almost uniquely. There are only
two points with these properties, lying on opposite sides of the plane.

Which of these points is the end point R of the vector $A \times B = \overrightarrow{P_0R}$
can be decided by the following “continuity” argument. The vector
product $A \times B$ depends continuously on the vectors A, B since its
components are bilinear functions of those of A, B. Then the direction
of $A \times B$ also depends continuously on A and B, as long as $A \times B \ne 0$,
that is, as long as A and B are prevented from becoming 0 or parallel.
We can always change the two vectors A and B continuously
in such a way that A and B are never 0 or parallel until finally
A coincides with the coordinate vector $E_1 = (1, 0, 0)$ and B with
the vector $E_2 = (0, 1, 0)$. This amounts to deforming the triangle
$P_0P_1P_2$ continuously and without degeneracy, so that $P_0$ goes into
the origin and $P_1$, $P_2$ come to lie respectively on the positive $x_1$-
and $x_2$-axis at the distance 1 from the origin. In the process, the point
R on the line through $P_0$ perpendicular to the plane $P_0P_1P_2$ never
crosses that plane. Now, by (71b),

$E_1 \times E_2 = (0, 0, 1) = E_3.$

In a “right-handed” coordinate system, the kind we usually employ,
the direction of $E_3$ is fixed unambiguously as normal to $E_1$ and $E_2$ in
such a way that the 90° rotation about the $x_3$-axis that takes $E_1$ into
$E_2$ appears counterclockwise from the point (0, 0, 1). Then, generally, if
our coordinate system is right-handed, the direction of $A \times B = \overrightarrow{P_0R}$
is such that the rotation about the line $P_0R$ of the vector $A = \overrightarrow{P_0P_1}$
into the vector $B = \overrightarrow{P_0P_2}$ by an angle $\gamma$ between 0 and $\pi$ appears
counterclockwise when viewed from R (see Fig. 2.8). Similarly, in a
left-handed coordinate system the 90° rotation from $E_1$ into $E_2$ appears

1In general, the area of the projection of a plane figure onto a second plane equals the 
product of the area of the original figure with the cosine of the angle between the 
two planes, as will become clear when we discuss transformations of integrals. 
Figure 2.8 Vector product $A \times B$ in right-handed coordinate system.


clockwise from (0, 0, 1), and so also does then the rotation from A into
B appear from the end point R of $A \times B = \overrightarrow{P_0R}$.
Generally, an ordered triple of three independent vectors A, B, C
defines a certain sense or orientation. If $A = \overrightarrow{P_0P_1}$, $B = \overrightarrow{P_0P_2}$, and
$C = \overrightarrow{P_0P_3}$, we can rotate the direction of A into that of B by an angle
between 0 and $\pi$ in the plane $P_0P_1P_2$. The sense of the triple A, B, C by
definition is the sense (counterclockwise or clockwise) that rotation
appears to have, when viewed from that side of the plane to which C
points.¹ The triple B, A, C has the opposite orientation. The orientation
of the triple A, B, $A \times B$ is always the same as that of the coordinate
vectors $E_1$, $E_2$, $E_3$.

We call the triple A, B, C oriented positively with respect to the $x_1,
x_2, x_3$-coordinate system if it has the same orientation as the triple of
vectors $E_1$, $E_2$, $E_3$, and oriented negatively if it has the opposite
orientation. For the triple A, B, C to be oriented positively with respect to the
$x_1, x_2, x_3$-coordinates it is necessary and sufficient that


1The same type of orientation determines the difference between left-handed and 
right-handed screws. The motion of a screw consists of a combination of translatory 
motion along an axis and rotation about that axis. The distinction between the two 
types of screws is defined by the sense of the rotation, clockwise or counterclockwise, 
when viewed from that direction of the axis in which the translation proceeds. 
(73)  $\det(A, B, C) > 0.$

For let $A = \overrightarrow{P_0P_1}$, $B = \overrightarrow{P_0P_2}$, $C = \overrightarrow{P_0P_3}$. Relation (73) means that
$(A \times B) \cdot C > 0,$

that is, that the directions of the vectors $A \times B$ and C form an acute
angle. Since $A \times B$ is normal to the plane $P_0P_1P_2$, this implies that the
vector $\overrightarrow{P_0P_3}$ points to the same side of the plane as the vector $A \times B$.
Hence, A, B, C and A, B, $A \times B$ have the same orientation, which is
that of $E_1$, $E_2$, $E_3$.

The three independent vectors A, B, C when given the same initial
point $P_0$ “span” a certain parallelepiped, namely, the one that has the
end points $P_1$, $P_2$, $P_3$ of A, B, C as vertices adjacent to the vertex $P_0$.
We call the parallelepiped oriented positively or negatively with
respect to the $x_1, x_2, x_3$-coordinate system according to the orientation
of the triple A, B, C. An interchange of any two of the vectors A, B, C
reverses the orientation for the parallelepiped spanned by the
vectors.¹

Let $\theta$ be the angle formed by the direction of the vectors C and $A \times B$.
By (71c),

(74a)  $\det(A, B, C) = |A \times B|\,|C| \cos\theta.$

Figure 2.9 Volume $V = |A \times B|\,h$ of parallelepiped.


1The orientation of the parallelepiped can be visualized as an orientation ascribed to 
each face of the parallelepiped (i.e., as a sense assigned to the boundary polygon of 
the face) such that a common edge of two neighboring faces is assigned opposite 
senses in the orientation of the two faces. The orientation of all faces is determined 
uniquely if for a single face the sense of one edge is prescribed. For the orientation 
of the parallelepiped spanned by A, B, C, the sense of the edge $P_0P_1$ in the face
spanned by the vectors $\overrightarrow{P_0P_1}$ and $\overrightarrow{P_0P_2}$ is that of proceeding from $P_0$ to $P_1$ (see Fig. 2.9).
Since $A \times B$ is perpendicular to the plane $P_0P_1P_2$, the angle between
the line $P_0P_3$ and the plane $P_0P_1P_2$ is $\tfrac{1}{2}\pi - \theta$. Thus,

(74b)  $h = |C|\,|\cos\theta| = |C| \left| \sin\left( \tfrac{\pi}{2} - \theta \right) \right|$

is the distance of the point $P_3$ from the plane $P_0P_1P_2$, that is, the
altitude of the parallelepiped from $P_3$. Since the volume V of the
parallelepiped is equal to the area $|A \times B|$ of one face multiplied by the
corresponding altitude h, it follows from (74a, b) that

(74c)  $V = |A \times B|\,h = |\det(A, B, C)|.$

In words, the volume of a parallelepiped spanned by three vectors A,
B, C is the absolute value of the determinant of the matrix with columns
A, B, C. Thus, the value of det(A, B, C) determines both the volume
and the orientation of the parallelepiped spanned by A, B, C. We
express this fact by the formula

(74d)  $\det(A, B, C) = \varepsilon V,$

where V is the volume of the parallelepiped spanned by the vectors
A, B, C and $\varepsilon = +1$ if the parallelepiped is oriented positively with
respect to $x_1, x_2, x_3$-coordinates and $\varepsilon = -1$ if oriented negatively.
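The statement that the determinant carries both the volume and the orientation can be tried out directly. The following Python sketch computes the pair (V, ε) from det(A, B, C) and compares V with the product of base area and altitude. The vectors and the function names are assumptions of this illustration, and the geometric comparison uses floating-point arithmetic with a tolerance.

```python
import math

def cross(A, B):
    return (A[1] * B[2] - A[2] * B[1],
            A[2] * B[0] - A[0] * B[2],
            A[0] * B[1] - A[1] * B[0])

def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def det3(A, B, C):
    """Determinant of the 3x3 matrix with columns A, B, C."""
    return (A[0] * (B[1] * C[2] - B[2] * C[1])
            - B[0] * (A[1] * C[2] - A[2] * C[1])
            + C[0] * (A[1] * B[2] - A[2] * B[1]))

def volume_and_orientation(A, B, C):
    """Return (V, eps) with det(A, B, C) = eps * V; eps is 0 for dependent vectors."""
    d = det3(A, B, C)
    return abs(d), (d > 0) - (d < 0)

A, B, C = (1, 2, 3), (4, 5, 6), (7, 8, 10)
V, eps = volume_and_orientation(A, B, C)
print(V, eps)

# Geometric check: area of the face spanned by A, B times the altitude from P3.
N = cross(A, B)
base_area = math.sqrt(dot(N, N))
altitude = abs(dot(C, N)) / base_area
print(math.isclose(base_area * altitude, V))
```

Interchanging A and B leaves V unchanged and reverses the sign of ε, in agreement with the remark on interchanges above.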


b. Expansion of a Determinant with Respect to a Column. Vector 
Products in Higher Dimensions 


Only in three dimensions can we define a product $A \times B$ of two
vectors A, B that again is a vector.¹ The closest analogue in n dimensions
would be a “vector product” of n - 1 vectors. Taking n vectors,

$A_1 = (a_{11}, \ldots, a_{n1}), \; \ldots, \; A_n = (a_{1n}, \ldots, a_{nn})$

in n-dimensional space, we can form the determinant of the matrix
$(A_1, \ldots, A_n)$ with those vectors as columns. The determinant of this
matrix is a linear form in the last vector $A_n$ and can be written as a
scalar product

(75)  $\det(A_1, \ldots, A_n) = z_1 a_{1n} + z_2 a_{2n} + \cdots + z_n a_{nn} = Z \cdot A_n,$


¹In higher dimensions we cannot associate with two vectors A, B a third vector
C outside the plane spanned by A, B in a geometric fashion, that is, by a construction 
that determines C uniquely and does not change under rigid motions. 
where the vector $Z = (z_1, \ldots, z_n)$ depends only on the n - 1 vectors
$A_1, A_2, \ldots, A_{n-1}$. Obviously, Z is linear in each of the vectors $A_1, \ldots,
A_{n-1}$ separately and is alternating. We can call Z the vector product
of $A_1, \ldots, A_{n-1}$ and denote it by

(76)  $Z = A_1 \times A_2 \times \cdots \times A_{n-1}.$
It is clear from (75) that
$Z \cdot A_1 = Z \cdot A_2 = \cdots = Z \cdot A_{n-1} = 0;$

we see that the vector product of n - 1 vectors is orthogonal to each
of the vectors, as in three dimensions. The length of the vector product
Z also can be interpreted geometrically as volume of the oriented
(n - 1)-dimensional parallelepiped spanned by the vectors $A_1, \ldots,
A_{n-1}$, as we shall see later.

Just as in three dimensions, the components of Z can be written
as determinants in analogy to formulae (71b). We first derive such
a determinant expression for the component $z_n$ of Z. By (75),

$z_n = Z \cdot E_n = \det(A_1, \ldots, A_{n-1}, E_n),$
where
$E_n = (0, 0, \ldots, 0, 1)$

is the n-th coordinate vector. Taking $A_n = E_n$ in the general
expansion formula (66a), p. 170, for determinants amounts to replacing the
last factor $a_{j_n n}$ in each term by 1 for $j_n = n$ and by 0 for $j_n \ne n$. For
$j_n = n$ the coefficient $\varepsilon_{j_1 \cdots j_{n-1} n}$ vanishes, unless $j_1, \ldots, j_{n-1}$
constitute a permutation of the numbers 1, 2, ..., n - 1. In that
case, the coefficient (65c, d) reduces to

$$\begin{aligned}
\varepsilon_{j_1 \cdots j_{n-1} j_n} = \varepsilon_{j_1 \cdots j_{n-1} n} &= \operatorname{sgn} \phi(j_1, \ldots, j_{n-1}, n) \\
&= \operatorname{sgn}\, (n - j_{n-1}) \cdots (n - j_1)\, \phi(j_1, \ldots, j_{n-1}) \\
&= \operatorname{sgn} \phi(j_1, \ldots, j_{n-1}) = \varepsilon_{j_1 \cdots j_{n-1}}.
\end{aligned}$$

It follows from (66a) that

(77a)
$$z_n = \sum_{j_1, \ldots, j_{n-1} = 1}^{n-1} \varepsilon_{j_1 \cdots j_{n-1}}\, a_{j_1 1} a_{j_2 2} \cdots a_{j_{n-1}\, n-1} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1\, n-1} \\ a_{21} & a_{22} & \cdots & a_{2\, n-1} \\ \vdots & \vdots & & \vdots \\ a_{n-1\, 1} & a_{n-1\, 2} & \cdots & a_{n-1\, n-1} \end{vmatrix}.$$


We see that $z_n$ is equal to the determinant of the matrix obtained
from the matrix $(A_1, \ldots, A_n)$ by omitting the last row and column.
Generally, one defines a minor of a matrix a as the determinant of
a square matrix obtained from a by omitting some of the rows and
columns, while preserving the relative positions of the remaining
elements. The minor complementary to an element $a_{jk}$ of a square
matrix a is the one obtained by omitting from a the row and column
containing the element $a_{jk}$. Thus $z_n$ is equal to the minor
complementary to $a_{nn}$.

The other components of the vector Z have similar representations.
We have, for example, by (75),

$z_{n-1} = \det(A_1, \ldots, A_{n-1}, E_{n-1}).$

To evaluate this determinant, we interchange the last two rows (see
p. 174), which changes the sign of the determinant. The last column
$E_{n-1}$ then goes over into $E_n$, and we find from our previous result that
$-z_{n-1}$ is equal to the determinant obtained by omitting the last row
and column of the new matrix or, equivalently, is equal to the minor
complementary to the element $a_{n-1\, n}$ in the original matrix. Similarly,
one finds that $\pm z_i$ for each i = 1, ..., n is equal to the minor
complementary to the element $a_{in}$, where the positive sign applies for
n - i even, the negative one for n - i odd.

Formula (75) thus constitutes an expansion of an nth-order deter- 
minant in terms of (n — 1)-order determinants, the minors com- 
plementary to the elements in the last column. For example, for 
n= 4 we have the formula 


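(77b)  $\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{vmatrix}$

$= -a_{14} \begin{vmatrix} a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{vmatrix} + a_{24} \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{vmatrix}$

$\quad - a_{34} \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{41} & a_{42} & a_{43} \end{vmatrix} + a_{44} \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}.$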


Interchanging columns, we can derive similar formulae for ex- 
panding a determinant in terms of the minors complementary to 
the elements of any given column. Expansions of this type play a role 
in many proofs that involve induction over the dimension of the space, 
as we shall see in the next sections. 
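
The expansion by the last column lends itself to a short recursive computation. The following is a minimal sketch, assuming the matrix is stored as a list of rows; the names `minor_matrix` and `det` are illustrative choices. Since rows are indexed from 0 here, the sign $(-1)^{n-i}$ of the text appears as $(-1)^{(n-1)-i}$.

```python
def minor_matrix(a, row, col):
    """Return the matrix a with the given row and column omitted."""
    return [[x for k, x in enumerate(r) if k != col]
            for j, r in enumerate(a) if j != row]

def det(a):
    """Determinant by expansion in terms of the last column."""
    n = len(a)
    if n == 1:
        return a[0][0]
    last = n - 1
    total = 0
    for i in range(n):
        # positive sign when (last - i) is even, negative when it is odd
        sign = 1 if (last - i) % 2 == 0 else -1
        total += sign * a[i][last] * det(minor_matrix(a, i, last))
    return total

# An arbitrary fourth-order example, expanded as in (77b).
example = [[2, 0, 1, 3],
           [1, 4, 0, 2],
           [0, 1, 5, 1],
           [3, 2, 1, 0]]
swapped = [example[1], example[0], example[2], example[3]]
```

Interchanging two rows, as in `swapped`, changes the sign of the value returned by `det`, in agreement with the rule used in the derivation above. The recursion is meant for illustration only; its cost grows like n!.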


c. Areas of Parallelograms and Volumes of Parallelepipeds in Higher 
Dimensions 


Surfaces in space can be built up from infinitesimal parallelo- 
grams. Thus, formulae for areas of curved surfaces and for integrals 
over surfaces require knowledge of an expression for the area of a 
parallelogram in space. Similarly, formulae for volumes or volume 
integrals over curved manifolds have to be based on expressions for 
volumes of parallelepipeds in higher dimensions. Such expressions are 
easily derived in greatest generality with the help of determinants. 

The basic quantity associated with vectors is the scalar product 
of two vectors 


$A = (a_1, \dots, a_n)$ and $B = (b_1, \dots, b_n),$
which in any Cartesian coordinate system is given by
$A \cdot B = a_1 b_1 + \cdots + a_n b_n.$


While the individual components $a_j$ and $b_k$ of A and B depend on the
special Cartesian coordinate system used, the scalar product has an
independent geometric meaning:

$A \cdot B = |A|\,|B| \cos\gamma,$

where |A|, |B| are the lengths of the vectors A and B, and $\gamma$ is the
angle between them. It follows that any quantity that can be expressed
in terms of scalar products has an invariant geometric meaning
and does not depend on the special Cartesian coordinate system
used.

The simplest quantity expressible in terms of scalar products is the
distance of two points $P_0$, $P_1$, which is the length of the vector
$A = \overrightarrow{P_0P_1}$. The square of that distance is given by

(78a)  $|A|^2 = A \cdot A.$


With two vectors A, B in n-dimensional space, we can associate the
area of a parallelogram spanned by the two vectors if we give them a
common initial point $P_0$. Let $A = \overrightarrow{P_0P_1}$ and
$B = \overrightarrow{P_0P_2}$. The vectors then span a parallelogram
$P_0, P_1, Q, P_2$ that has $P_1$ and $P_2$ as vertices
adjacent to the vertex $P_0$. By elementary geometry the area $\alpha$ of the
parallelogram is equal to the product of adjacent sides multiplied by
the sine of the included angle $\gamma$:

$\alpha = |A|\,|B| \sin\gamma$
$= \sqrt{|A|^2 |B|^2 - |A|^2 |B|^2 \cos^2\gamma}$
$= \sqrt{|A|^2 |B|^2 - (A \cdot B)^2},$

as we found already on p. 182 for the special case n = 3. We can write
this formula for the area $\alpha$ more elegantly in the form of a
determinant for the square of $\alpha$:


(78b)  $\alpha^2 = (A \cdot A)(B \cdot B) - (A \cdot B)(B \cdot A) = \begin{vmatrix} A \cdot A & A \cdot B \\ B \cdot A & B \cdot B \end{vmatrix}.$

The determinant that appears here on the right-hand side is called
the Gram determinant of the vectors A, B and denoted by $\Gamma(A, B)$.
It is clear from the derivation that

$\Gamma(A, B) \geq 0$

for all vectors A, B and that equality holds only if A and B are
dependent.¹
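
Formula (78b) can be evaluated directly from the components in any number of dimensions. The following minimal sketch assumes vectors are given as tuples of numbers; the function names are illustrative.

```python
import math

def dot(u, v):
    """Scalar product a_1 b_1 + ... + a_n b_n."""
    return sum(x * y for x, y in zip(u, v))

def gram2(a, b):
    """Gram determinant of two vectors, as in (78b)."""
    return dot(a, a) * dot(b, b) - dot(a, b) * dot(b, a)

def parallelogram_area(a, b):
    """Area of the parallelogram spanned by a and b in n-dimensional space."""
    # max(...) guards against a tiny negative value caused by rounding
    return math.sqrt(max(gram2(a, b), 0.0))

# Two perpendicular vectors in four-dimensional space with lengths 3 and 4.
A = (3.0, 0.0, 0.0, 0.0)
B = (0.0, 4.0, 0.0, 0.0)
area = parallelogram_area(A, B)
```

For dependent vectors, for instance B = 2A, `gram2` returns 0, which corresponds to the case of equality mentioned above.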

We can derive a similar expression for the square of the volume V
of a parallelepiped spanned by three vectors A, B, C in n-dimensional
space. We represent the vectors in the form

$A = \overrightarrow{P_0P_1}, \quad B = \overrightarrow{P_0P_2}, \quad C = \overrightarrow{P_0P_3}$

and consider the parallelepiped that has $P_1, P_2, P_3$ as vertices
adjacent to the vertex $P_0$. Its volume V can be defined as the product
of the area $\alpha$ of one of its faces multiplied by the corresponding
altitude h. Choosing for $\alpha$ the area of the parallelogram spanned
by the vectors A and B, we have to take for h the distance of the
point $P_3$ from the plane through $P_0, P_1, P_2$. Thus,

$V^2 = h^2 \alpha^2 = h^2\, \Gamma(A, B) = h^2 \begin{vmatrix} A \cdot A & A \cdot B \\ B \cdot A & B \cdot B \end{vmatrix}.$

¹That is, if either one of the vectors vanishes (|A| or |B| = 0) or if they are parallel
($\sin\gamma = 0$).

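We interpret h to stand for the “perpendicular” distance of $P_3$ from
the plane $P_0P_1P_2$, that is, the length of that vector
$D = \overrightarrow{PP_3}$ which is perpendicular to the plane and has
its initial point P in the plane. For a point P in the plane $P_0P_1P_2$
the vector $\overrightarrow{P_0P}$ must be dependent on
$A = \overrightarrow{P_0P_1}$ and $B = \overrightarrow{P_0P_2}$ (see p. 144):

$\overrightarrow{P_0P} = \lambda A + \mu B.$

Hence, the vector D has the form

$D = \overrightarrow{PP_3} = \overrightarrow{P_0P_3} - \overrightarrow{P_0P} = C - \lambda A - \mu B$

with suitable constants $\lambda, \mu$. If D is to be perpendicular to the
plane spanned by A and B, we must have

(79a)  $A \cdot D = 0, \quad B \cdot D = 0.$

This leads to a system of linear equations for determining $\lambda$ and $\mu$:

(79b)  $A \cdot C = \lambda\, A \cdot A + \mu\, A \cdot B, \quad B \cdot C = \lambda\, B \cdot A + \mu\, B \cdot B.$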


The determinant of these equations is just the Gram determinant
$\Gamma(A, B)$. Assuming A and B to be independent vectors, we have
$\Gamma(A, B) \neq 0$. There exists, then, a uniquely determined solution
$\lambda, \mu$ of equations (79b) and, hence, a unique vector
$D = \overrightarrow{PP_3}$ perpendicular to the plane $P_0P_1P_2$ and with
initial point in that plane. The length of that vector is equal to the
distance h, so that by (79a)

$h^2 = |D|^2 = D \cdot D = (C - \lambda A - \mu B) \cdot D$
$= C \cdot D - \lambda\, A \cdot D - \mu\, B \cdot D$
$= C \cdot D = C \cdot C - \lambda\, C \cdot A - \mu\, C \cdot B.$

This results in the expression

(79c)  $V^2 = (C \cdot C - \lambda\, A \cdot C - \mu\, B \cdot C)\, \Gamma(A, B).$

This expression for the square of the volume of the parallelepiped
spanned by A, B, C can be written more elegantly as the Gram
determinant formed from the vectors A, B, C:

(79d)  $V^2 = \begin{vmatrix} A \cdot A & A \cdot B & A \cdot C \\ B \cdot A & B \cdot B & B \cdot C \\ C \cdot A & C \cdot B & C \cdot C \end{vmatrix} = \Gamma(A, B, C).$
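To show the identity of the expressions (79c) and (79d) for $V^2$, we
make use of the fact that the value of the determinant $\Gamma(A, B, C)$
does not change if we subtract from the last column $\lambda$ times the
first column and $\mu$ times the second column:

$\Gamma(A, B, C) = \begin{vmatrix} A \cdot A & A \cdot B & A \cdot C - \lambda\, A \cdot A - \mu\, A \cdot B \\ B \cdot A & B \cdot B & B \cdot C - \lambda\, B \cdot A - \mu\, B \cdot B \\ C \cdot A & C \cdot B & C \cdot C - \lambda\, C \cdot A - \mu\, C \cdot B \end{vmatrix}.$

It follows from (79b) that

$\Gamma(A, B, C) = \begin{vmatrix} A \cdot A & A \cdot B & 0 \\ B \cdot A & B \cdot B & 0 \\ C \cdot A & C \cdot B & C \cdot C - \lambda\, C \cdot A - \mu\, C \cdot B \end{vmatrix}.$

Expanding this determinant in terms of the last column leads back
immediately to the expression (79c).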
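
The agreement of (79c) and (79d) can also be checked numerically. The sketch below solves the system (79b) by Cramer's rule (our choice of method, not prescribed by the text), forms $h^2 = |D|^2$, and compares $h^2\,\Gamma(A, B)$ with the third-order Gram determinant; all function names are illustrative.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram2(a, b):
    """Second-order Gram determinant (78b)."""
    return dot(a, a) * dot(b, b) - dot(a, b) ** 2

def gram3(a, b, c):
    """Third-order Gram determinant (79d), written out for a symmetric matrix."""
    aa, ab, ac = dot(a, a), dot(a, b), dot(a, c)
    bb, bc, cc = dot(b, b), dot(b, c), dot(c, c)
    return (aa * bb * cc + 2 * ab * bc * ac
            - aa * bc ** 2 - bb * ac ** 2 - cc * ab ** 2)

def altitude_squared(a, b, c):
    """Solve (79b) for lambda and mu, then return h^2 = |C - lambda A - mu B|^2."""
    g = gram2(a, b)  # determinant of the system (79b); nonzero for independent a, b
    lam = (dot(a, c) * dot(b, b) - dot(a, b) * dot(b, c)) / g
    mu = (dot(a, a) * dot(b, c) - dot(b, a) * dot(a, c)) / g
    d = [ci - lam * ai - mu * bi for ai, bi, ci in zip(a, b, c)]
    return dot(d, d)

# Three independent vectors in four-dimensional space.
A = (1.0, 0.0, 0.0, 0.0)
B = (1.0, 1.0, 0.0, 0.0)
C = (1.0, 1.0, 1.0, 2.0)
volume_squared_from_altitude = altitude_squared(A, B, C) * gram2(A, B)
volume_squared_from_gram = gram3(A, B, C)
```

The two quantities computed at the end agree up to rounding, and they continue to agree when the roles of A, B, C are permuted, in line with the remark that follows.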

Formula (79d) shows that the volume V of the parallelepiped spanned
by the vectors A, B, C does not depend on the choice of the face and of
the corresponding altitude used in the computation, for the value of
$\Gamma(A, B, C)$ does not change when we permute A, B, C. For example,
$\Gamma(B, A, C)$ can be obtained by interchanging in the determinant
for $\Gamma(A, B, C)$ the first two rows and then the first two columns.

Formula (79c) can be written as

$\Gamma(A, B, C) = |D|^2\, \Gamma(A, B).$

It follows that

$\Gamma(A, B, C) \geq 0$

for any vectors A, B, C. Here the equal sign can only hold if either
$\Gamma(A, B) = 0$ or D = 0. The relation $\Gamma(A, B) = 0$ would imply that
A and B are dependent. If D = 0, we would have $C = \lambda A + \mu B$, so
that C would depend on A and B. Hence the Gram determinant
$\Gamma(A, B, C)$ vanishes if and only if the vectors A, B, C are dependent.

For n = 3 formula (79d) follows immediately from the formula
(74c) for the volume V of an oriented parallelepiped spanned by three
vectors A, B, C in three-dimensional space. This is a consequence of
identity (68f), p. 174, according to which

$\det(A, B, C)\, \det(A, B, C) = \Gamma(A, B, C).$

The expression for $V^2$ as a Gram determinant has the advantage of
showing that V is independent of the special Cartesian coordinate
system used, and hence that V has a geometrical meaning.

We can proceed to “volumes” V of four-dimensional parallelepipeds
spanned by four vectors $A = \overrightarrow{P_0P_1}$, $B = \overrightarrow{P_0P_2}$,
$C = \overrightarrow{P_0P_3}$, $D = \overrightarrow{P_0P_4}$
in n-dimensional space ($n \geq 4$). Defining V as the product of the
volume of the three-dimensional parallelepiped spanned by the three
vectors A, B, C with the distance of the point $P_4$ from the
three-dimensional “plane” through the points $P_0, P_1, P_2, P_3$, we arrive
by exactly the same steps as before at an expression for $V^2$ as a Gram
determinant:

(80a)  $V^2 = \begin{vmatrix} A \cdot A & A \cdot B & A \cdot C & A \cdot D \\ B \cdot A & B \cdot B & B \cdot C & B \cdot D \\ C \cdot A & C \cdot B & C \cdot C & C \cdot D \\ D \cdot A & D \cdot B & D \cdot C & D \cdot D \end{vmatrix} = \Gamma(A, B, C, D).$

If here n = 4, the Gram determinant becomes the square of the
determinant of the matrix with columns A, B, C, D, and we find that

(80b)  $V = |\det(A, B, C, D)|.$


More generally, m vectors $A_1, \dots, A_m$ in n-dimensional space,
to which we assign a common initial point $P_0$, span an m-dimensional
parallelepiped. The square of the volume V of that parallelepiped is
given by the Gram determinant

(81a)  $V^2 = \begin{vmatrix} A_1 \cdot A_1 & A_1 \cdot A_2 & \cdots & A_1 \cdot A_m \\ A_2 \cdot A_1 & A_2 \cdot A_2 & \cdots & A_2 \cdot A_m \\ \vdots & \vdots & & \vdots \\ A_m \cdot A_1 & A_m \cdot A_2 & \cdots & A_m \cdot A_m \end{vmatrix} = \Gamma(A_1, \dots, A_m).$

For m = n we obtain for the volume of the parallelepiped spanned by
n vectors in n-space the formula

(81b)  $V = |\det(A_1, \dots, A_n)|.$

One proves by induction over m that

$\Gamma(A_1, \dots, A_m) \geq 0,$

where equality holds if and only if the vectors $A_1, \dots, A_m$ are
dependent.¹
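
For general m the Gram determinant (81a) is conveniently evaluated by elimination. The following sketch uses exact rational arithmetic from Python's standard `fractions` module so that the comparison with (81b) is free of rounding; the elimination routine and the names `det` and `gram` are our own illustrative choices.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # find a row at or below the diagonal with a nonzero entry in this column
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result          # a row interchange changes the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return result

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram(vectors):
    """Gram determinant (81a) of m vectors in n-dimensional space."""
    return det([[dot(u, v) for v in vectors] for u in vectors])

# m = n = 3: the Gram determinant equals the square of det(A_1, A_2, A_3).
vectors = [(1, 2, 0), (0, 1, 3), (2, 0, 1)]
```

With `vectors` as above, `gram(vectors)` equals `det(vectors) ** 2`, as (81b) requires, while a dependent set such as (1, 2, 3), (2, 4, 6) gives the value 0.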


d. Orientation of Parallelepipeds in n-Dimensional Space 


Later on, in Chapter 5, when we need a consistent method to fix
the sign of multiple integrals, we have to make use of signed volumes
and orientations of parallelepipeds in n-dimensional space.

For the volume spanned by n vectors $A_1, \dots, A_n$ in n-dimensional
space we have by (81b) the expression

$V = |\det(A_1, \dots, A_n)|.$

We call $\det(A_1, \dots, A_n)$ the volume in $x_1 \cdots x_n$-coordinates of
the oriented parallelepiped spanned by $A_1, \dots, A_n$. The
parallelepiped or the set of vectors $A_1, \dots, A_n$ is called positively
oriented with respect to the coordinate system if $\det(A_1, \dots, A_n)$ is
positive, negatively oriented if the determinant is negative. Thus,

(81c)  $\det(A_1, \dots, A_n) = \varepsilon V,$

where V is the volume of the parallelepiped spanned by the vectors
$A_1, \dots, A_n$ and $\varepsilon = +1$ or $-1$ according to whether the
parallelepiped is oriented positively or negatively with respect to the
coordinate system.

While the square of $\det(A_1, \dots, A_n)$ has a geometrical meaning
independent of the Cartesian coordinate system, this is not the case
for the sign of the determinant. Interchanging, for example, the
$x_1$- and $x_2$-axes results in the interchange of the first two rows of the
determinant and, hence, in a change of sign in $\det(A_1, \dots, A_n)$.
What has an independent geometric meaning, however, is the statement
that two n-dimensional parallelepipeds in n-dimensional space
have the same or have the opposite orientation.

¹In the case of dependent vectors $A_1, \dots, A_m$ with common initial point $P_0$ the
parallelepiped spanned by these vectors “collapses” into a linear manifold of $m - 1$
dimensions or less and has m-dimensional volume equal to 0.

Consider two ordered sets of vectors $A_1, \dots, A_n$ and $B_1, \dots, B_n$
in n-dimensional space, where we assume that each set consists of
independent vectors. Obviously, the two sets have the same orientation
(that is, are both oriented positively or both negatively with
respect to the $x_1 \cdots x_n$-system) if and only if the condition

(82a)  $\det(A_1, \dots, A_n) \cdot \det(B_1, \dots, B_n) > 0$

is satisfied. Using the identity (68f), we can write this condition in the
form

(82b)  $[A_1, \dots, A_n; B_1, \dots, B_n] > 0,$

where the symbol on the left denotes the function of 2n vectors defined
by

(82c)  $[A_1, \dots, A_n; B_1, \dots, B_n] = \begin{vmatrix} A_1 \cdot B_1 & A_1 \cdot B_2 & \cdots & A_1 \cdot B_n \\ A_2 \cdot B_1 & A_2 \cdot B_2 & \cdots & A_2 \cdot B_n \\ \vdots & \vdots & & \vdots \\ A_n \cdot B_1 & A_n \cdot B_2 & \cdots & A_n \cdot B_n \end{vmatrix}.$

Notice that for $B_1 = A_1, \dots, B_n = A_n$ the symbol
$[A_1, \dots, A_n; B_1, \dots, B_n]$ reduces to the Gram determinant
$\Gamma(A_1, \dots, A_n)$. Formulae (82b, c) make it evident that having the
same orientation is a geometric property that does not depend on the
specific Cartesian coordinate system used. We denote this property
symbolically by


(82d)  $\Omega(A_1, \dots, A_n) = \Omega(B_1, \dots, B_n)$

and the property of having the opposite orientation¹ by

(82e)  $\Omega(A_1, \dots, A_n) = -\Omega(B_1, \dots, B_n).$

¹The individual orientation $\Omega$ of an n-tuple of vectors does not stand for a “number.”
Formula (82f) only associates a value $\pm 1$ with the ratio of two orientations, while
formulae (82d, e) express equality or inequality of orientations. It is, of course,
possible to describe the two different possible orientations of n-tuples completely by
numerical values, say, giving the value $\Omega = +1$ to one orientation, the value
$\Omega = -1$ to the other. This involves, however, the arbitrary selection of a “standard
orientation” we call +1 (for example, that given by the coordinate vectors),
whereas the relations (82d, e, f) are meaningful independent of any numerical value
assigned to $\Omega$. Analogous situations are common throughout mathematics.


Then, generally, for two sets of n independent vectors in n-dimensional
space,

(82f)  $\Omega(B_1, \dots, B_n) = \operatorname{sgn}[A_1, \dots, A_n; B_1, \dots, B_n]\, \Omega(A_1, \dots, A_n).$

The set $A_1, \dots, A_n$ is oriented positively or negatively with respect
to $x_1 \cdots x_n$-coordinates according to whether

(83a)  $\Omega(A_1, \dots, A_n) = \Omega(E_1, \dots, E_n)$

or

(83b)  $\Omega(A_1, \dots, A_n) = -\Omega(E_1, \dots, E_n),$

where $E_1, \dots, E_n$ are the coordinate vectors. On occasion, we shall
denote the orientation $\Omega(E_1, \dots, E_n)$ of the coordinate system by
$\Omega(x_1, x_2, \dots, x_n)$.
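
The criterion (82b) can be tried out on small examples. In the sketch below, which is only an illustration with names of our own choosing, the symbol (82c) is computed from scalar products alone, and its sign decides whether two sets of independent vectors have the same orientation.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def det(m):
    """Determinant by expansion along the first row (adequate for small n)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** k * m[0][k]
               * det([list(row[:k]) + list(row[k + 1:]) for row in m[1:]])
               for k in range(len(m)))

def bracket(a_vectors, b_vectors):
    """The symbol [A_1, ..., A_n; B_1, ..., B_n] of (82c)."""
    return det([[dot(a, b) for b in b_vectors] for a in a_vectors])

def same_orientation(a_vectors, b_vectors):
    """Condition (82b); both sets are assumed to consist of independent vectors."""
    return bracket(a_vectors, b_vectors) > 0

E1, E2, E3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
coordinate_vectors = [E1, E2, E3]
interchanged = [E2, E1, E3]      # interchanging two vectors reverses the orientation
cyclic = [E2, E3, E1]            # a cyclic permutation preserves it
```

For any two triples the value of `bracket` coincides with the product of the two determinants, which is the content of the identity (68f) used above.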

For two sets of n vectors in n-dimensional space, $A_1, \dots, A_n$ and
$A_1', \dots, A_n'$, we have by (82c), (81b)

(84a)  $[A_1, \dots, A_n; A_1', \dots, A_n'] = \varepsilon \varepsilon' V V'.$

Here V and V' are, respectively, the volumes of the parallelepipeds
spanned by the two sets of vectors; the factors $\varepsilon, \varepsilon'$ depend on their
orientations and those of the coordinate vectors:

(84b)  $\varepsilon = \operatorname{sgn}[A_1, \dots, A_n; E_1, \dots, E_n],$

(84c)  $\varepsilon' = \operatorname{sgn}[A_1', \dots, A_n'; E_1, \dots, E_n].$

The product

(84d)  $\varepsilon \varepsilon' = \operatorname{sgn}[A_1, \dots, A_n; A_1', \dots, A_n']$

is independent of the choice of the coordinate system and has the
value +1 if the parallelepipeds have the same orientation but $-1$ if
they have the opposite orientation.

¹(continued) For example, in Euclidean geometry, equality of distances and even the ratio
of distances have a meaning even when no numerical values are assigned to the distances (as in
Euclid's Elements). It is true that we can describe distances by real numbers, such
that the ratio of distances is just that of the corresponding real numbers. This
requires the arbitrary selection of a “standard distance” (e.g., a meter), to which all
other distances are referred, and thus introduces in some sense a “nongeometrical”
element.

Using the definition in terms of scalar products, we can form the
expression

(85a)  $[A_1, \dots, A_m; A_1', \dots, A_m'] = \begin{vmatrix} A_1 \cdot A_1' & A_1 \cdot A_2' & \cdots & A_1 \cdot A_m' \\ A_2 \cdot A_1' & A_2 \cdot A_2' & \cdots & A_2 \cdot A_m' \\ \vdots & \vdots & & \vdots \\ A_m \cdot A_1' & A_m \cdot A_2' & \cdots & A_m \cdot A_m' \end{vmatrix}$

for any 2m vectors $A_1, \dots, A_m, A_1', \dots, A_m'$ in n-dimensional
space. It is clear from the definition that this expression is a
multilinear form in the 2m vectors. For example, the vector $A_1'$ occurs
only in the first column and the elements of that column are linear forms
in $A_1'$. Since the whole determinant is a linear form in the elements of
the first column, it follows that it is a linear form in $A_1'$. It also is
evident from (85a) that the expression is an alternating function of the
vectors $A_1', \dots, A_m'$ for fixed $A_1, \dots, A_m$ and an alternating
function of $A_1, \dots, A_m$ for fixed $A_1', \dots, A_m'$. It follows (see
the footnote on p. 000) that

(85b)  $[A_1, \dots, A_m; A_1', \dots, A_m'] = 0$

whenever the m vectors $A_1, \dots, A_m$ or the m vectors $A_1', \dots, A_m'$
are dependent. In particular (85b) always holds when m > n.
Assume then that m < n and that the vectors $A_1, \dots, A_m$ and the
vectors $A_1', \dots, A_m'$ are independent. We can assume that all these
vectors are given the same initial point, say the origin O of
n-dimensional space. Then $A_1, \dots, A_m$ span an m-dimensional linear
manifold $\pi$ through O and $A_1', \dots, A_m'$ another such plane $\pi'$.
Introduce an orthonormal system of vectors $E_1, \dots, E_m$ as coordinate
vectors in $\pi$ and another orthonormal system of vectors
$E_1', \dots, E_m'$ in $\pi'$.¹ For fixed $A_1, \dots, A_m$ the function (85a)
is an alternating multilinear form in the vectors $A_1', \dots, A_m'$ and,
hence (see p. 149), is given by

$[A_1, \dots, A_m; A_1', \dots, A_m'] = [A_1, \dots, A_m; E_1', \dots, E_m']\, \det(A_1', \dots, A_m'),$

where $\det(A_1', \dots, A_m')$ is the determinant of the matrix formed by
the components of the vectors $A_1', \dots, A_m'$ referred to
$E_1', \dots, E_m'$ as coordinate vectors. Obviously the coefficient
$[A_1, \dots, A_m; E_1', \dots, E_m']$ itself is an alternating multilinear
form in $A_1, \dots, A_m$ and, hence, given by

$[E_1, \dots, E_m; E_1', \dots, E_m']\, \det(A_1, \dots, A_m),$

where the last determinant is formed from the matrix of components
of $A_1, \dots, A_m$ referred to the coordinate vectors $E_1, \dots, E_m$.
Using formula (81c), we obtain the identity

(85c)  $[A_1, \dots, A_m; A_1', \dots, A_m'] = \mu\, \varepsilon \varepsilon' V V'.$

¹These two systems of coordinate vectors in $\pi$ and $\pi'$ do not have to be related
to each other in any way nor to the coordinate system to which the whole
n-dimensional space containing $\pi$ and $\pi'$ is referred.


Here V and V' are respectively the volumes of the parallelepipeds
spanned by the vectors $A_1, \dots, A_m$ and $A_1', \dots, A_m'$. The factors
$\varepsilon, \varepsilon'$ relate the orientations of the parallelepipeds to those of the
coordinate systems in $\pi$ and $\pi'$:

$\varepsilon = \operatorname{sgn}[A_1, \dots, A_m; E_1, \dots, E_m],$
$\varepsilon' = \operatorname{sgn}[A_1', \dots, A_m'; E_1', \dots, E_m'].$

Finally, the coefficient

$\mu = [E_1, \dots, E_m; E_1', \dots, E_m']$

depends only on the spaces $\pi$ and $\pi'$ and the coordinate systems
chosen in those spaces. If $\pi = \pi'$ we can choose

$E_1' = E_1, \dots, E_m' = E_m;$

in that case $\mu = 1$, as in formula (84a).

For $\mu \neq 0$, we can use formula (85c) to relate orientations in two
distinct m-dimensional linear manifolds $\pi$ and $\pi'$, both lying in the same
n-dimensional space.¹ Replacing, if necessary, one of the coordinate
vectors by its opposite, we can always contrive that $\mu > 0$. Then,
by (85c),

(85d)  $\operatorname{sgn}[A_1, \dots, A_m; A_1', \dots, A_m'] = \varepsilon \varepsilon'.$

Thus, the condition

$[A_1, \dots, A_m; A_1', \dots, A_m'] > 0$

for any $A_1, \dots, A_m$ in $\pi$ and $A_1', \dots, A_m'$ in $\pi'$ signifies that both
sets of vectors are oriented positively or both oriented negatively
with respect to the coordinate systems in those spaces.

¹One verifies easily that $\mu = 0$ only when $\pi$ and $\pi'$ are perpendicular to each other,
that is, when $\pi'$ contains a vector orthogonal to all vectors in $\pi$. More generally,
the coefficient $\mu$ can be interpreted as cosine of the angle between the two manifolds
(see problem 13, p. 203).
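
A simple special case may make the role of the coefficient $\mu$ concrete. In the sketch below (an illustration under our own assumptions) $\pi$ is the $x_1x_2$-plane of three-dimensional space and $\pi'$ is obtained from it by a rotation through an angle `theta` about the $x_1$-axis; the orthonormal vectors chosen in each plane and all function names are hypothetical choices.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def bracket2(a1, a2, b1, b2):
    """The symbol [A_1, A_2; B_1, B_2] of (85a) for m = 2."""
    return dot(a1, b1) * dot(a2, b2) - dot(a1, b2) * dot(a2, b1)

def plane_bases(theta):
    """Orthonormal vectors E_1, E_2 in pi and E_1', E_2' in the tilted plane pi'."""
    e1, e2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    f1, f2 = (1.0, 0.0, 0.0), (0.0, math.cos(theta), math.sin(theta))
    return e1, e2, f1, f2

def combine(c1, u, c2, v):
    """The linear combination c1*u + c2*v."""
    return tuple(c1 * x + c2 * y for x, y in zip(u, v))

theta = math.pi / 3
e1, e2, f1, f2 = plane_bases(theta)
mu = bracket2(e1, e2, f1, f2)            # equals cos(theta) for this configuration

# Vectors with components (2, 1), (1, 3) in pi and (1, 1), (-1, 2) in pi'.
a1, a2 = combine(2, e1, 1, e2), combine(1, e1, 3, e2)
b1, b2 = combine(1, f1, 1, f2), combine(-1, f1, 2, f2)
lhs = bracket2(a1, a2, b1, b2)
rhs = mu * (2 * 3 - 1 * 1) * (1 * 2 - 1 * (-1))   # mu * det(A) * det(A')
```

Here `lhs` and `rhs` agree up to rounding, as (85c) predicts, and `mu` tends to 0 as `theta` approaches a right angle, which is the perpendicular case excluded in the text.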


e. Orientation of Planes and Hyperplanes 


The choice of a particular Cartesian coordinate system in an
m-dimensional linear manifold $\pi$ determines a certain orientation

$\Omega(E_1, \dots, E_m),$

where $E_1, \dots, E_m$ are the coordinate vectors. This choice fixes which
sets of m vectors $A_1, \dots, A_m$ in $\pi$ are called positively oriented,
namely, those with the same orientation as $E_1, \dots, E_m$. We denote
by $\pi^*$ the combination of the linear space $\pi$ with the selection of a
particular orientation in $\pi$ and call $\pi^*$ an oriented linear manifold. We
write $\Omega(\pi^*)$ for the selected orientation and call m independent
vectors $A_1, \dots, A_m$ in $\pi$ oriented positively if

$\Omega(A_1, \dots, A_m) = \Omega(\pi^*).$

We call $\pi^*$ oriented positively with respect to a particular Cartesian
coordinate system if the orientation of the coordinate vectors is the
same as that of $\pi^*$.

An oriented two-dimensional plane $\pi^*$ can be visualized as a
plane with a distinguished positive sense of rotation. If a pair of vectors
A, B is oriented “positively” with respect to $\pi^*$, the positive sense
of rotation of $\pi^*$ is the sense of the rotation by an angle less than 180°
that takes the direction of A into that of B.¹

If the oriented two-dimensional plane $\pi^*$ lies in an oriented
three-dimensional plane $\sigma^*$, we can distinguish a positive and negative side
of $\pi^*$. Let $P_0$ be any point of $\pi^*$. We take two independent vectors
$B = \overrightarrow{P_0P_1}$, $C = \overrightarrow{P_0P_2}$ in $\pi^*$ for which

(86a)  $\Omega(B, C) = \Omega(\pi^*).$

A third vector $A = \overrightarrow{P_0P_3}$ independent of B, C is said to point to the
positive side of $\pi^*$ if

(86b)  $\Omega(A, B, C) = \Omega(\sigma^*).$

If $\sigma^*$ is oriented positively with respect to a Cartesian coordinate
system, we can replace condition (86b) by

(86c)  $\det(A, B, C) > 0$

in that system. If $\sigma^*$ is oriented positively with respect to the usual
right-handed coordinate system, then the positive side of an oriented
plane $\pi^*$ is the one from which the positive sense of rotation in $\pi^*$
appears counterclockwise.

¹Notice that the orientation of $\pi^*$ can only be described by pointing out a specific
positively oriented pair of vectors B, C in $\pi$ or a specific rotating object in $\pi$ (e.g.,
a clock) that has the distinguished sense of rotation. There is no abstract way of
deciding whether a given rotation is clockwise or counterclockwise, any more than
there is an abstract way of saying which is the right and which the left side. These
questions can only be decided by reference to some standard objects.

The same terminology applies to oriented hyperplanes $\pi^*$ in
n-dimensional oriented space $\sigma^*$. Given $n - 1$ vectors $A_2, \dots, A_n$
in $\pi^*$ with

(87a)  $\Omega(A_2, \dots, A_n) = \Omega(\pi^*),$

a vector $A_1$ is said to point to the positive side of $\pi^*$ if

(87b)  $\Omega(A_1, \dots, A_{n-1}, A_n) = \Omega(\sigma^*).$


f. Change of Volume of Parallelepipeds in Linear Transformations 


A square matrix $a = (a_{jk})$ with n rows and columns determines a
linear transformation or mapping Y = aX of vectors X in n-dimensional
space into vectors Y of the same space. Here we assume that
X and Y are referred to the same coordinate vectors $E_1, \dots, E_n$. For
$X = (x_1, \dots, x_n)$, $Y = (y_1, \dots, y_n)$ the transformation, written
out by components, has the form

$y_j = \sum_{k=1}^{n} a_{jk} x_k \quad (j = 1, \dots, n).$

A set of n vectors $B_1 = (b_{11}, \dots, b_{n1}), \dots, B_n = (b_{1n}, \dots, b_{nn})$ is
transformed into the set of n vectors
$C_1 = (c_{11}, \dots, c_{n1}), \dots, C_n = (c_{1n}, \dots, c_{nn})$, where

$c_{jk} = \sum_{r=1}^{n} a_{jr} b_{rk}.$

By the rule for the determinant of a product of matrices (p. 172), we
have

(88a)  $\det(C_1, \dots, C_n) = \det(a) \cdot \det(B_1, \dots, B_n).$

This formula contains the two formulae

(88b)  $|\det(C_1, \dots, C_n)| = |\det(a)|\, |\det(B_1, \dots, B_n)|,$

(88c)  $\operatorname{sgn} \det(C_1, \dots, C_n) = [\operatorname{sgn} \det(a)]\,[\operatorname{sgn} \det(B_1, \dots, B_n)].$


These two rules can be formulated immediately in geometrical
language:

The linear transformation of n-dimensional space into itself
corresponding to a square matrix a multiplies the volume of every
parallelepiped spanned by n vectors by the same constant factor $|\det(a)|$.
It preserves the orientation of all n-dimensional parallelepipeds if
$\det(a) > 0$, and changes the orientation of all of them if $\det(a) < 0$.¹

For a rigid motion, the matrix a is orthogonal and, hence (see p.
175), has determinant +1 or $-1$. Thus, rigid motions preserve the
volume of parallelepipeds. Those for which $\det(a) = +1$ preserve
sense; the others invert it.
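
Formula (88a) is easily tested on a small example. The sketch below is illustrative only: the matrix, the vectors, and the function names are our own choices, and the determinant is computed by expansion along the first row.

```python
def det(m):
    """Determinant by expansion along the first row (adequate for small n)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** k * m[0][k]
               * det([list(row[:k]) + list(row[k + 1:]) for row in m[1:]])
               for k in range(len(m)))

def apply(a, x):
    """Components y_j = sum over k of a_jk x_k of the image vector Y = aX."""
    n = len(x)
    return tuple(sum(a[j][k] * x[k] for k in range(n)) for j in range(n))

a = [[2, 1, 0],
     [0, 1, 0],
     [0, 0, -1]]                       # det(a) is negative: orientation is reversed
B = [(1, 0, 0), (1, 1, 0), (0, 0, 1)]  # the vectors B_1, B_2, B_3
C = [apply(a, b) for b in B]           # their images C_1, C_2, C_3

# The determinant of the matrix with rows C_k equals that of the matrix with
# columns C_k, so det(C) may be compared directly with det(a) * det(B).
volume_factor = abs(det(C)) / abs(det(B))
```

In this example `volume_factor` equals `abs(det(a))`, and the signs of `det(C)` and `det(B)` are opposite, in accordance with (88b) and (88c).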


¹It is important to emphasize the assumptions in this theorem. Only volumes of
n-dimensional parallelepipeds are multiplied by the same factor; lower-dimensional
ones are multiplied by factors that vary with their location. Also, we have to assume
that image and original refer to the same coordinate system if the statement about
orientations is to hold.


Exercises 2.4


1. Treat number 5 of Exercises 2.2 in terms of vector products.

2. In a uniform rotation let $(\alpha, \beta, \gamma)$ be the direction cosines of the axis of
rotation, which passes through the origin, and $\omega$ the angular velocity.
Find the velocity of the point (x, y, z).

3. Show that the plane through the three points $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$,
$(x_3, y_3, z_3)$ is given by

$\begin{vmatrix} x_1 - x & y_1 - y & z_1 - z \\ x_2 - x & y_2 - y & z_2 - z \\ x_3 - x & y_3 - y & z_3 - z \end{vmatrix} = 0.$

4. Find the shortest distance between two straight lines l and l' in space,
given by the equations x = at + b, y = ct + d, z = et + f and
x = a't + b', y = c't + d', z = e't + f'.

5. Show that the area of a convex polygon with the successive vertices
$P_1(x_1, y_1), P_2(x_2, y_2), \dots, P_n(x_n, y_n)$ is given by half the absolute value of

$\begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} + \begin{vmatrix} x_2 & x_3 \\ y_2 & y_3 \end{vmatrix} + \cdots + \begin{vmatrix} x_{n-1} & x_n \\ y_{n-1} & y_n \end{vmatrix} + \begin{vmatrix} x_n & x_1 \\ y_n & y_1 \end{vmatrix}.$

6. Prove that the area of the triangle with vertices $(x_1, y_1)$, $(x_2, y_2)$, and
$(x_3, y_3)$ is

$\frac{1}{2} \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}.$

7. If the vertices of the triangle of the preceding exercise have rational
coordinates, prove the triangle cannot be equilateral.

8. (a) Prove the inequality

$D = \begin{vmatrix} a & b & c \\ a' & b' & c' \\ a'' & b'' & c'' \end{vmatrix} \leq \sqrt{(a^2 + b^2 + c^2)(a'^2 + b'^2 + c'^2)(a''^2 + b''^2 + c''^2)}.$

(b) When does the equality sign hold?

9. Prove the vector identities

(a) $A \times (B \times C) = (A \cdot C)\, B - (A \cdot B)\, C$
(b) $(X \times Y) \cdot (X' \times Y') = (X \cdot X')(Y \cdot Y') - (X \cdot Y')(Y \cdot X')$
(c) $[X \times (Y \times Z)] \cdot \{[Y \times (Z \times X)] \times [Z \times (X \times Y)]\} = 0.$

10. Give the formula for a rotation through the angle $\phi$ about the axis
x : y : z = 1 : 0 : −1 such that the rotation of the plane x = z is positive
when looked at from the point (−1, 0, 1).

11. If A, B, and C are independent, use the two representations of
$X = (A \times B) \times (C \times D)$ obtained from Exercise 9a to express D as a linear
combination of A, B, and C.

12. Let Ox, Oy, Oz and Ox', Oy', Oz' be two right-handed coordinate
systems. Assume that Oz and Oz' do not coincide; let the angle zOz' be
$\theta$ ($0 < \theta < \pi$). Draw the half-line $Ox_1$ at right angles to both Oz and Oz'
and such that the system $Ox_1$, Oz, Oz' has the same orientation as Ox,
Oy, Oz. Then $Ox_1$ is the line of intersection of the planes Oxy and Ox'y'.
Let the angle $xOx_1$ be $\phi$ and the angle $x_1Ox'$ be $\psi$ and let them be
measured in the usual positive sense in their respective planes, Oxy and
Ox'y'. Find the matrix for the change of coordinates.

13. Let $\pi$ and $\pi'$ be two m-dimensional linear subspaces of the same
n-dimensional space with respective orthonormal bases $E_1, E_2, \dots, E_m$
and $E_1', E_2', \dots, E_m'$. Show that
$\mu = [E_1, E_2, \dots, E_m; E_1', E_2', \dots, E_m'] = 0$ if and only if $\pi$ and $\pi'$
are orthogonal, that is, one space contains a vector perpendicular to all
the vectors of the other.


2.5 Vector Notions in Analysis
a. Vector Fields


Mathematical analysis comes into play when we are concerned 
with a vector manifold depending on one or more continuously vary- 
ing parameters. 

If, for example, we consider a material occupying a portion of space
and in a state of motion, then at a given instant each particle of the
material will have a definite velocity represented by a vector
$U = (u_1, u_2, u_3)$. We say that these vectors form a vector field in the
region in question. The three components of the field vector then appear as
three functions

$u_1(x_1, x_2, x_3), \quad u_2(x_1, x_2, x_3), \quad u_3(x_1, x_2, x_3)$

of the three coordinates $x_1, x_2, x_3$ of the position of the particle at the
instant in question. We would usually represent U as a vector with
initial point $(x_1, x_2, x_3)$.

The forces acting at different points of space likewise form a vector
field. As an example of a force field we consider the gravitational force
per unit mass exerted by a heavy particle, according to Newton’s law
of attraction. According to that law the field vector $F = (f_1, f_2, f_3)$ at
each point $(x_1, x_2, x_3)$ is directed toward the attracting particle, and
its magnitude is inversely proportional to the square of the distance
from the particle.

Field vectors, like U or F, have a physical meaning independent of
coordinates. In a given Cartesian $x_1, x_2, x_3$-coordinate system the
vector U has components $u_1, u_2, u_3$ that depend on the coordinate
system. In a different Cartesian coordinate system the point that
originally had coordinates $x_1, x_2, x_3$ receives the coordinates $y_1, y_2, y_3$,
where the $y_j$ and $x_k$ are connected by equations of the form

(89a)  $y_1 = a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + b_1$
       $y_2 = a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + b_2$
       $y_3 = a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + b_3$

or

(89b)  $y_j = \sum_{k=1}^{3} a_{jk} x_k + b_j \quad (j = 1, 2, 3).$

The components $v_1, v_2, v_3$ of the vector U in the new coordinate system
are then given by the corresponding homogeneous relations

(89c)  $v_j = \sum_{k=1}^{3} a_{jk} u_k \quad (j = 1, 2, 3).$

The matrix $a = (a_{jk})$ is orthogonal, so that (see p. 158) its
reciprocal is equal to its transpose. Consequently, the solutions of
equations (89b), (89c) for $x_k$ and $u_k$ take the form

(89d)  $x_k = \sum_{j=1}^{3} a_{jk}(y_j - b_j) \quad (k = 1, 2, 3),$

(89e)  $u_k = \sum_{j=1}^{3} a_{jk} v_j \quad (k = 1, 2, 3).$

Any three functions $u_1, u_2, u_3$ of the variables $x_1, x_2, x_3$ determine
a field of vectors U with components $u_1, u_2, u_3$ in $x_1, x_2, x_3$-coordinates.
If the field is to have a meaning independent of the choice of
coordinate systems, the components $v_j$ of U in a Cartesian
$y_1, y_2, y_3$-coordinate system have to be given by formula (89c) whenever
the $y_j$ and $x_k$ are connected by formulae (89a).
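
The relations (89b) to (89e) can be exercised with a concrete orthogonal matrix. In the sketch below we assume, purely for illustration, that the new axes arise from the old ones by a rotation about the $x_3$-axis combined with a shift of the origin; the function names and the numerical values are hypothetical.

```python
import math

def rotation_about_x3(t):
    """An orthogonal matrix a = (a_jk): rotation of the axes through the angle t."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def new_point(a, b, x):
    """(89b): y_j = sum over k of a_jk x_k + b_j."""
    return [sum(a[j][k] * x[k] for k in range(3)) + b[j] for j in range(3)]

def old_point(a, b, y):
    """(89d): x_k = sum over j of a_jk (y_j - b_j), using the transpose of a."""
    return [sum(a[j][k] * (y[j] - b[j]) for j in range(3)) for k in range(3)]

def new_components(a, u):
    """(89c): v_j = sum over k of a_jk u_k (no shift for vector components)."""
    return [sum(a[j][k] * u[k] for k in range(3)) for j in range(3)]

def old_components(a, v):
    """(89e): u_k = sum over j of a_jk v_j."""
    return [sum(a[j][k] * v[j] for j in range(3)) for k in range(3)]

a = rotation_about_x3(0.7)
b = [1.0, -2.0, 0.5]
x = [0.3, 1.2, -0.4]
u = [2.0, -1.0, 3.0]
y = new_point(a, b, x)
v = new_components(a, u)
```

Applying `old_point` to `y` and `old_components` to `v` recovers `x` and `u` up to rounding, and the length of the vector is the same whether computed from `u` or from `v`, as it must be for a quantity with a meaning independent of coordinates.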


b. Gradient of a Scalar


A scalar is a function s = s(P) of the points P in space. In any
Cartesian coordinate system in which the point P is described by its
coordinates $x_1, x_2, x_3$ the scalar s becomes a function $s = f(x_1, x_2, x_3)$.
We may regard the three partial derivatives

$u_1 = \frac{\partial s}{\partial x_1} = f_{x_1}(x_1, x_2, x_3),$

$u_2 = \frac{\partial s}{\partial x_2} = f_{x_2}(x_1, x_2, x_3),$

$u_3 = \frac{\partial s}{\partial x_3} = f_{x_3}(x_1, x_2, x_3)$

as components in $x_1, x_2, x_3$-coordinates of a vector $U = (u_1, u_2, u_3)$.

In any new Cartesian $y_1, y_2, y_3$-coordinate system connected with
the original one by relations (89a) or (89d), the scalar s is represented
by the function

$s = g(y_1, y_2, y_3) = f\left( \sum_{k=1}^{3} a_{k1}(y_k - b_k),\ \sum_{k=1}^{3} a_{k2}(y_k - b_k),\ \sum_{k=1}^{3} a_{k3}(y_k - b_k) \right).$

By the chain rule of differentiation (p. 55) we have

$\frac{\partial s}{\partial y_j} = g_{y_j} = \sum_{k=1}^{3} f_{x_k}\, a_{jk} = \sum_{k=1}^{3} a_{jk} u_k \quad (j = 1, 2, 3).$

Using the relations (89c), we see that the vector U has the
components $v_j = \partial s / \partial y_j$ in the $y_1, y_2, y_3$-system. Thus the partial
derivatives of the scalar s formed in any Cartesian coordinate system form
the components of a vector U that does not depend on the system. We
call U the gradient of the scalar s and write

U = grad s.


By formula (14b), p. 45, the derivative of s in the direction with
direction cosines $\cos\alpha_1, \cos\alpha_2, \cos\alpha_3$ is given in
$x_1, x_2, x_3$-coordinates by

(90a)  $D_{(\alpha)} s = \frac{\partial s}{\partial x_1} \cos\alpha_1 + \frac{\partial s}{\partial x_2} \cos\alpha_2 + \frac{\partial s}{\partial x_3} \cos\alpha_3.$

Introducing the unit vector $R = (\cos\alpha_1, \cos\alpha_2, \cos\alpha_3)$ in the
direction with direction angles $\alpha_1, \alpha_2, \alpha_3$, we can write the
derivative of s in that direction in vector notation as

(90b)  $D_{(\alpha)} s = R \cdot \operatorname{grad} s.$

We find from the Cauchy-Schwarz inequality (see p. 132) for |R| = 1

$|D_{(\alpha)} s| \leq |R|\, |\operatorname{grad} s| = |\operatorname{grad} s|.$

Thus, the derivative of s in any direction never exceeds the length of
the gradient of s. Taking for R the unit vector in the direction of grad
s, we find for the directional derivative the value

$D_{(\alpha)} s = \frac{1}{|\operatorname{grad} s|} (\operatorname{grad} s) \cdot (\operatorname{grad} s) = |\operatorname{grad} s|.$


Thus, the length of the gradient vector of s is equal to the maximum 
rate of change of s in any direction. The direction of the gradient is 
the one in which the scalar s increases most rapidly, while in the 
opposite direction s decreases most rapidly. 

We shall return to the geometrical interpretation of the gradient
in Chapter 3. We can, however, immediately give an intuitive idea 
of the direction of the gradient. Confining ourselves first to vectors 
in two dimensions, we have to consider the gradient of a scalar 
s = f(x1, x2). We shall suppose that s is represented by its level lines 
(or contour lines) 


s = f(x1, x2) = constant = c 


in the x1, x2-plane. Then the derivative of s at a point P in the
direction of the level line through P is obviously 0, for if Q is another
point on the same level line, the equation s(Q) − s(P) = 0 holds;
dividing by the distance ρ of Q and P and letting ρ tend to 0 we find in
the limit (see p. 45) that the derivative of s in the direction tangential
to the level line at P is 0. Thus, by (90b), R · grad s = 0 if R is a unit
vector in the direction of the tangent to the level line, and therefore,
at every point the gradient vector of s is perpendicular to the level line
through that point. An exactly analogous statement holds for the 
gradient in three dimensions. If we represent the scalar s by its level 
surfaces 


s = f(x1, x2, x3) = constant = c, 


the gradient has component zero in every direction tangential to the 
level surface and is therefore perpendicular to the level surface. 
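
The two properties of the gradient derived above (maximal rate of change along grad s, zero rate of change along a level line) can be checked numerically. The following is only a sketch: the scalar field `f`, the sample point, and the finite-difference step are illustrative assumptions, not taken from the text.

```python
import math

def f(x1, x2):
    # Illustrative scalar field s = f(x1, x2); an assumption for this sketch.
    return x1 * x1 + 3.0 * x2 * x2

def grad(func, x1, x2, h=1e-6):
    # Central-difference approximation of (ds/dx1, ds/dx2).
    g1 = (func(x1 + h, x2) - func(x1 - h, x2)) / (2.0 * h)
    g2 = (func(x1, x2 + h) - func(x1, x2 - h)) / (2.0 * h)
    return g1, g2

def directional_derivative(func, x1, x2, angle):
    # Formula (90b) in the plane: D s = R . grad s with R = (cos angle, sin angle).
    g1, g2 = grad(func, x1, x2)
    return math.cos(angle) * g1 + math.sin(angle) * g2

p = (1.0, 0.5)
g = grad(f, *p)
norm_g = math.hypot(*g)

# Sample the directional derivative in 360 directions, one per degree.
samples = [directional_derivative(f, p[0], p[1], 2.0 * math.pi * k / 360.0)
           for k in range(360)]
largest = max(samples)

# A direction tangent to the level line is the gradient direction rotated by 90 degrees.
tangent_angle = math.atan2(g[1], g[0]) + math.pi / 2.0
along_level_line = directional_derivative(f, p[0], p[1], tangent_angle)

print("|grad s| =", norm_g)
print("largest sampled directional derivative =", largest)
print("derivative along the level line =", along_level_line)
```

The largest sampled value approaches |grad s| from below, and the derivative along the level line vanishes up to rounding.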

In applications, we frequently meet with vector fields that represent
the gradient of a scalar function. The gravitational field of force
due to a particle of mass M concentrated in a point Q = (ξ1, ξ2, ξ3) may
be taken as an example. Let F = (f1, f2, f3) denote the force exerted
by the attractive mass M on a particle of mass m located at the
point P = (x1, x2, x3). Denote by R the vector

$$
\mathbf{R} = \overrightarrow{QP} = (x_1 - \xi_1,\ x_2 - \xi_2,\ x_3 - \xi_3).
$$

By Newton’s law of gravitation, F has the direction of −R and the
magnitude C/|R|², where C = γmM (here γ denotes the universal
gravitational constant). Hence,

$$
\mathbf{F} = -\frac{C}{|\mathbf{R}|^{3}}\,\mathbf{R}
$$

or

$$
f_j = C\,\frac{\xi_j - x_j}{\left(\sqrt{(\xi_1 - x_1)^2 + (\xi_2 - x_2)^2 + (\xi_3 - x_3)^2}\right)^{3}} \qquad (j = 1, 2, 3).
$$

By differentiation, one verifies immediately that

$$
f_j = \frac{\partial}{\partial x_j}\,\frac{C}{\sqrt{(\xi_1 - x_1)^2 + (\xi_2 - x_2)^2 + (\xi_3 - x_3)^2}} \qquad (j = 1, 2, 3).
$$

Hence,

$$
\mathbf{F} = \operatorname{grad}\frac{C}{r}, \tag{91}
$$

where

$$
r = \sqrt{(\xi_1 - x_1)^2 + (\xi_2 - x_2)^2 + (\xi_3 - x_3)^2} = |\mathbf{R}|
$$

is the distance of the two particles at P and Q.

If a field of force is the gradient of a scalar function, this scalar 
function is often called the potential function of the field. We shall 
consider this concept from a more general point of view in the study 
of work and energy (pp. 657 and 714). 
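
As a numerical illustration of formula (91), the sketch below compares Newton's force F = −C R/|R|³ with a finite-difference gradient of the potential C/r. The values of `C`, `Q`, and the test point are arbitrary assumptions chosen only for the check.

```python
import math

C = 2.0                  # stands for gamma*m*M; the numerical value is illustrative
Q = (0.3, -1.0, 0.5)     # position of the attracting mass (illustrative)

def distance(x):
    # r = |R|, the distance between the point x and Q.
    return math.sqrt(sum((xi - qi) ** 2 for xi, qi in zip(x, Q)))

def potential(x):
    # The scalar C/r whose gradient is claimed to be the force.
    return C / distance(x)

def force(x):
    # Newton's law: direction of -R and magnitude C/|R|^2, i.e. F = -C R / |R|^3.
    r = distance(x)
    return tuple(-C * (xi - qi) / r ** 3 for xi, qi in zip(x, Q))

def numerical_gradient(func, x, h=1e-5):
    # Central differences in each of the three coordinates.
    result = []
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        result.append((func(xp) - func(xm)) / (2.0 * h))
    return tuple(result)

P = (1.0, 2.0, 3.0)
print("force       :", force(P))
print("grad (C/r)  :", numerical_gradient(potential, P))
```

Both printed vectors agree to within the accuracy of the difference quotient.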


c. Divergence and Curl of a Vector Field 


By differentiation we have assigned to every scalar a vector field, 
the gradient. Similarly, we can assign by differentiation to every 
vector field U a certain scalar, known as the divergence of the vector 
field U. For a specific Cartesian x1, x2, x3-coordinate system in which 
U = (u1, u2, u3), we define the divergence of the vector U as the func- 
tion 

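$$
\operatorname{div}\mathbf{U} = \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3}, \tag{92}
$$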


that is, as the sum of the partial derivatives of the three com- 
ponents with respect to the corresponding coordinates. We can show 
that the scalar div U defined in this way does not depend on the 
particular choice of Cartesian coordinate system.¹ Let the coordinates


¹This would not be the case for other expressions formed from the first derivatives of
the components of the vector U, for example,

0x1 | axe 3x3 
or 

0x2 0x3 0x1 
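y1, y2, y3 of a point in a different Cartesian system be connected with
x1, x2, x3 by equations (89b); the components v1, v2, v3 of U in the new
system are then given by relations (89c). We have from the chain rule
of differentiation

$$
\frac{\partial u_k}{\partial y_j} = \sum_{i=1}^{3} \frac{\partial u_k}{\partial x_i}\,\frac{\partial x_i}{\partial y_j} = \sum_{i=1}^{3} a_{ji}\,\frac{\partial u_k}{\partial x_i}
$$

and hence, by the orthogonality relations for the coefficients a_jk,

$$
\sum_{j=1}^{3} \frac{\partial v_j}{\partial y_j}
= \sum_{j=1}^{3} \frac{\partial}{\partial y_j} \sum_{k=1}^{3} a_{jk}\,u_k
= \sum_{i,k=1}^{3} \left(\sum_{j=1}^{3} a_{jk}\,a_{ji}\right) \frac{\partial u_k}{\partial x_i}
= \sum_{k=1}^{3} \frac{\partial u_k}{\partial x_k},
$$

which shows that we are led to the same scalar div U in any other
coordinate system.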

Here we content ourselves with the formal definition of the diver- 
gence; its physical interpretation will be discussed later (Chapter 
V, Section 9). 

We shall adopt the same procedure for the so-called curl of a vector 
field U. The curl is itself a vector 


B = curl U. 


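If in an x1, x2, x3-coordinate system the vector U has the components
u1, u2, u3, we define the components b1, b2, b3 of curl U by

$$
b_1 = \frac{\partial u_3}{\partial x_2} - \frac{\partial u_2}{\partial x_3},\qquad
b_2 = \frac{\partial u_1}{\partial x_3} - \frac{\partial u_3}{\partial x_1},\qquad
b_3 = \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2}. \tag{93}
$$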


We could verify as in the other cases that our definition of the curl of 
a vector U actually yields a vector independent of the particular 
coordinate system, provided the Cartesian coordinate systems con- 
sidered all have the same orientation. However, we omit these 
computations here, since in Chapter V, p. 616 we shall give a physical 
interpretation of the curl that clearly brings out its vectorial character. 

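The three concepts of gradient, divergence, and curl can all be
related to one another if we use a symbolic vector with the components

$$
\frac{\partial}{\partial x_1},\quad \frac{\partial}{\partial x_2},\quad \frac{\partial}{\partial x_3}.
$$

This vector differential operator is usually denoted by the symbol ∇,
pronounced “del.” The gradient of a scalar s is the product of the
symbolic vector ∇ with the scalar quantity s;¹ that is, it is the vector

$$
\operatorname{grad} s = \nabla s = \left(\frac{\partial s}{\partial x_1},\ \frac{\partial s}{\partial x_2},\ \frac{\partial s}{\partial x_3}\right). \tag{94}
$$

The divergence of a vector U = (u1, u2, u3) is the scalar product

$$
\operatorname{div}\mathbf{U} = \nabla\cdot\mathbf{U} = \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3}. \tag{94b}
$$

Finally the curl of the vector U is the vector product

$$
\operatorname{curl}\mathbf{U} = \nabla\times\mathbf{U}
= \left(\frac{\partial u_3}{\partial x_2} - \frac{\partial u_2}{\partial x_3},\ \frac{\partial u_1}{\partial x_3} - \frac{\partial u_3}{\partial x_1},\ \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2}\right) \tag{94c}
$$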


[see (71b), p. 180]. The fact that the vector ∇ is independent of the
Cartesian coordinate system used to define its components follows from
the chain rule of differentiation; under the coordinate transformation
(89d), we have by the chain rule

$$
\frac{\partial}{\partial y_j} = \sum_{k=1}^{3} \frac{\partial x_k}{\partial y_j}\,\frac{\partial}{\partial x_k} = \sum_{k=1}^{3} a_{jk}\,\frac{\partial}{\partial x_k},
$$

which shows that the components of ∇ transform according to the
rule (89c) for vectors. This makes it obvious that also ∇s, ∇ · U and
∇ × U do not depend on coordinates.²

In conclusion, we mention a few relations that constantly recur. 
The curl of a gradient is zero; in symbols, 


$$
\operatorname{curl}\operatorname{grad} s = \nabla\times(\nabla s) = 0. \tag{95a}
$$


¹We are forced here to write the vector in front of the scalar in the product ∇s,
contrary to our usual habit, since the components of the symbolic vector ∇ do not
commute with ordinary scalars.

²This statement has to be qualified in the case of the curl. Generally, magnitude and
direction of the vector product of two vectors have a geometrical meaning, as explained
on p. 185, except that the product changes into the opposite when we change the
orientation of the Cartesian coordinate system used. This implies for a vector U
that curl U = ∇ × U behaves like a vector, as long as we do not change the
orientation of the coordinate system (i.e., as long as only orthogonal transformations with
determinant +1 are used). Changing the orientation of the coordinate system
results in changing curl U into its opposite.

The divergence of a curl is zero; in symbols,

$$
\operatorname{div}\operatorname{curl}\mathbf{U} = \nabla\cdot(\nabla\times\mathbf{U}) = 0. \tag{95b}
$$


As we easily see, these relations follow from the definitions of
divergence, curl, and gradient, using the interchangeability of
differentiations. Relations (95a, b) also follow formally if we apply the ordinary
rules for vectors to the symbolic vector ∇, since then

$$
\nabla\times(\nabla s) = (\nabla\times\nabla)s = 0,\qquad \nabla\cdot(\nabla\times\mathbf{U}) = \det(\nabla, \nabla, \mathbf{U}) = 0.
$$
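
Written out in components, the verification by interchanging differentiations runs as follows. This is a worked sketch using the component definitions (93) and (94); it assumes that s and the components of U have continuous second derivatives, so that mixed derivatives may be interchanged.

```latex
% First component of curl grad s; the other two follow by cyclic permutation.
\bigl(\nabla\times(\nabla s)\bigr)_1
  = \frac{\partial}{\partial x_2}\frac{\partial s}{\partial x_3}
  - \frac{\partial}{\partial x_3}\frac{\partial s}{\partial x_2} = 0 .

% Divergence of the curl: the six mixed derivatives cancel in pairs.
\nabla\cdot(\nabla\times\mathbf{U})
  = \frac{\partial}{\partial x_1}\!\left(\frac{\partial u_3}{\partial x_2}-\frac{\partial u_2}{\partial x_3}\right)
  + \frac{\partial}{\partial x_2}\!\left(\frac{\partial u_1}{\partial x_3}-\frac{\partial u_3}{\partial x_1}\right)
  + \frac{\partial}{\partial x_3}\!\left(\frac{\partial u_2}{\partial x_1}-\frac{\partial u_1}{\partial x_2}\right) = 0 .
```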


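Another extremely important combination of our vector differential
operators is the divergence of a gradient:

$$
\operatorname{div}\operatorname{grad} s = \nabla\cdot(\nabla s) = \frac{\partial^2 s}{\partial x_1^2} + \frac{\partial^2 s}{\partial x_2^2} + \frac{\partial^2 s}{\partial x_3^2} = \Delta s. \tag{95c}
$$

Here

$$
\Delta = \nabla\cdot\nabla = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2} \tag{95d}
$$

is known as the “Laplace operator” or the “Laplacian.” The partial
differential equation

$$
\Delta s = \frac{\partial^2 s}{\partial x_1^2} + \frac{\partial^2 s}{\partial x_2^2} + \frac{\partial^2 s}{\partial x_3^2} = 0 \tag{95e}
$$

satisfied by many important scalars s in mathematical physics is
called the “Laplace equation” or “potential equation.”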
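
One familiar scalar satisfying the Laplace equation away from the origin is 1/r, the potential met in formula (91) up to a constant factor. The sketch below checks this numerically with second difference quotients; the test point and the step size are assumptions of the example.

```python
import math

def inv_r(x1, x2, x3):
    # The scalar 1/r with r measured from the origin.
    return 1.0 / math.sqrt(x1 * x1 + x2 * x2 + x3 * x3)

def r_squared(x1, x2, x3):
    # A scalar that is not harmonic; its Laplacian is the constant 6.
    return x1 * x1 + x2 * x2 + x3 * x3

def laplacian(func, x, h=1e-3):
    # Sum of the three second central difference quotients, approximating (95c).
    total = 0.0
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        total += (func(*xp) - 2.0 * func(*x) + func(*xm)) / (h * h)
    return total

point = (1.0, 2.0, 2.0)
print("Laplacian of 1/r :", laplacian(inv_r, point))
print("Laplacian of r^2 :", laplacian(r_squared, point))
```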

The terminology of “vector analysis” is often used also when the 
number of independent variables is other than three. A system of 
n functions u1, ..., un of n independent variables x1, ..., xn
determines a vector field in n-dimensional space. The concepts of 
gradient of a scalar and of the Laplace operator then retain their 
meaning. Notions analogous to the curl of a vector become more 
complicated. The most satisfactory approach to analogues of rela- 
tions (95a,b) in n dimensions is through the calculus of exterior 
differential forms, which will be described in the next chapter. 


d. Families of Vectors. Application to the Theory of Curves in 
Space and to Motion of Particles 


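In addition to vector fields we also consider one-parametric
manifolds of vectors, called families of vectors, where the vectors U =
(u1, u2, u3) do not correspond to each point of a region in space but to
each value of a single parameter t. We write U = U(t). The derivative
of the vector U can be defined naturally as

$$
\frac{d\mathbf{U}}{dt} = \lim_{h\to 0} \frac{1}{h}\,[\mathbf{U}(t + h) - \mathbf{U}(t)]. \tag{96a}
$$

It obviously has the components

$$
\frac{du_1}{dt},\quad \frac{du_2}{dt},\quad \frac{du_3}{dt}. \tag{96b}
$$

One easily verifies that this vector differentiation satisfies analogues
of the ordinary laws for derivatives:

$$
\frac{d}{dt}(\mathbf{U} + \mathbf{V}) = \frac{d\mathbf{U}}{dt} + \frac{d\mathbf{V}}{dt},\qquad
\frac{d}{dt}(\lambda\mathbf{U}) = \frac{d\lambda}{dt}\,\mathbf{U} + \lambda\,\frac{d\mathbf{U}}{dt}, \tag{97a}
$$

$$
\frac{d}{dt}(\mathbf{U}\cdot\mathbf{V}) = \mathbf{U}\cdot\frac{d\mathbf{V}}{dt} + \frac{d\mathbf{U}}{dt}\cdot\mathbf{V}, \tag{97b}
$$

$$
\frac{d}{dt}(\mathbf{U}\times\mathbf{V}) = \mathbf{U}\times\frac{d\mathbf{V}}{dt} + \frac{d\mathbf{U}}{dt}\times\mathbf{V}. \tag{97c}
$$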


We apply these notions to the case where the family of vectors
consists of the position vectors X = X(t) = OP of the points P on a curve
in space given in parametric representation:

$$
x_1 = \phi_1(t),\quad x_2 = \phi_2(t),\quad x_3 = \phi_3(t).
$$

Then

$$
\mathbf{X} = (x_1, x_2, x_3) = (\phi_1(t), \phi_2(t), \phi_3(t)).
$$

The vector dX/dt has the direction of the tangent to the curve at the
point corresponding to t. For the vector ΔX = X(t + Δt) − X(t)
has the direction of the line segment joining the points with
parameter values t and t + Δt. The same holds for the vector ΔX/Δt, when
Δt > 0. As Δt → 0 the direction of this chord approaches the
direction of the tangent. If instead of t we introduce as parameter the
length of arc s of the curve measured from a definite starting point,
we can prove that

$$
\left|\frac{d\mathbf{X}}{ds}\right|^2 = \frac{d\mathbf{X}}{ds}\cdot\frac{d\mathbf{X}}{ds} = 1. \tag{98}
$$


The proof follows exactly the same lines as the corresponding proof 
for plane curves (see Volume I, p. 354). Thus, dX/ds is a unit vector. 
Differentiating both sides of equation (98) with respect to s, using rule 
(97b), we obtain 


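$$
\frac{d^2\mathbf{X}}{ds^2}\cdot\frac{d\mathbf{X}}{ds} + \frac{d\mathbf{X}}{ds}\cdot\frac{d^2\mathbf{X}}{ds^2} = 2\,\frac{d\mathbf{X}}{ds}\cdot\frac{d^2\mathbf{X}}{ds^2} = 0. \tag{99}
$$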


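This equation states that the vector

$$
\frac{d^2\mathbf{X}}{ds^2} = \left(\frac{d^2 x_1}{ds^2},\ \frac{d^2 x_2}{ds^2},\ \frac{d^2 x_3}{ds^2}\right)
$$

is perpendicular to the tangent. This vector we call the curvature
vector or principal normal vector, and its length

$$
k = \frac{1}{\rho} = \left|\frac{d^2\mathbf{X}}{ds^2}\right| \tag{100}
$$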


we call the curvature of the curve at the corresponding point. The
reciprocal ρ = 1/k of the curvature we call the radius of curvature,
as before. The point obtained by measuring from the point on the
curve a length ρ in the direction of the principal normal vector is
called the center of curvature.

We shall show that this definition of curvature agrees with the one
given for plane curves in Volume I (p. 354). For each s the vector
Y = dX/ds is of length 1 and has the direction of the tangent. If we
think of the vectors Y(s + Δs) and Y(s) as having the origin as
common initial point, then the difference ΔY = Y(s + Δs) − Y(s)
is represented by the vector joining the end points. The angle β
between the tangents to the curve at the points with parameters s
and s + Δs is equal to the angle between the vectors Y(s) and
Y(s + Δs). Then

$$
|\Delta\mathbf{Y}| = |\mathbf{Y}(s + \Delta s) - \mathbf{Y}(s)| = 2\sin\frac{\beta}{2},
$$

since

$$
|\mathbf{Y}(s)| = |\mathbf{Y}(s + \Delta s)| = 1.
$$

Using

$$
\frac{2\sin(\beta/2)}{\beta} \to 1 \quad\text{for } \beta \to 0,
$$

we find that

$$
k = \left|\frac{d^2\mathbf{X}}{ds^2}\right| = \left|\frac{d\mathbf{Y}}{ds}\right| = \lim_{\Delta s\to 0}\frac{|\Delta\mathbf{Y}|}{|\Delta s|} = \lim_{\Delta s\to 0}\frac{\beta}{|\Delta s|}.
$$

Hence, k is the limit of the ratio of the angle between the tangents at
two points of the curve and the length of arc between those points as
the points approach each other. But this limit defines curvature for
plane curves.¹
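
As a concrete check of (98), (99), and (100), the sketch below uses a circular helix parametrized by arc length. The helix, its constants `A` and `B`, and the difference-quotient steps are assumptions of the example; for this curve the curvature is known in closed form to be A/(A² + B²).

```python
import math

A, B = 2.0, 1.0                  # helix radius and pitch parameter (illustrative)
SCALE = math.sqrt(A * A + B * B) # makes s the arc length of the helix

def X(s):
    # Position vector X(s) of the helix, with s the length of arc.
    return (A * math.cos(s / SCALE), A * math.sin(s / SCALE), B * s / SCALE)

def first_derivative(s, h=1e-5):
    # Central difference quotient for dX/ds.
    p, m = X(s + h), X(s - h)
    return tuple((a - b) / (2.0 * h) for a, b in zip(p, m))

def second_derivative(s, h=1e-4):
    # Second central difference quotient for d^2X/ds^2.
    p, c, m = X(s + h), X(s), X(s - h)
    return tuple((a - 2.0 * b + d) / (h * h) for a, b, d in zip(p, c, m))

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

s0 = 0.7
tangent = first_derivative(s0)
curvature_vector = second_derivative(s0)

print("|dX/ds|            =", norm(tangent))                 # equation (98)
print("dX/ds . d2X/ds2    =", dot(tangent, curvature_vector)) # equation (99)
print("k = |d2X/ds2|      =", norm(curvature_vector))         # equation (100)
print("closed form A/(A^2+B^2) =", A / (A * A + B * B))
```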


The curvature vector plays an important part in mechanics. We 
suppose that a particle moving along a curve has the position vector 
X(t) at the time t. The velocity of the motion is then given both in 
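magnitude and direction by the vector dX/dt. Similarly, the
acceleration is given by the vector d²X/dt². By the chain rule, we have

$$
\frac{d\mathbf{X}}{dt} = \frac{ds}{dt}\,\frac{d\mathbf{X}}{ds}
$$

and

$$
\frac{d^2\mathbf{X}}{dt^2} = \frac{d^2 s}{dt^2}\,\frac{d\mathbf{X}}{ds} + \left(\frac{ds}{dt}\right)^{2}\frac{d^2\mathbf{X}}{ds^2}. \tag{101}
$$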


In view of what we know already about the first and second deriva- 
tives of the vector X with respect to s, equation (101) expresses the 
following facts: the acceleration vector of the motion is the sum of 
two vectors. One of these is directed along the tangent to the curve 
and its length is equal to d²s/dt², that is, to the acceleration of the
point in its path (the rate of change of speed or tangential accelera- 
tion). The other is directed normal to the path toward the center of 
curvature, and its length is equal to the square of the speed multiplied 
by the curvature (the normal acceleration). For a particle of unit mass 


¹In the case of space curves, we cannot, as for plane curves, identify β with the
increment Δα of an angle of inclination α. The reason is that the angle between
Y(s) and Y(s + Δs) is generally not equal to the difference of the angles the vectors
Y(s) and Y(s + Δs) form with some fixed third direction. Angles between directions
in space are not additive, as in the plane.

the acceleration vector is equal to the force acting on the particle. If
no force acts in the direction of the curve (as is the case for a particle
constrained to move along a curve subject only to the reaction forces
acting normal to the curve), the tangential acceleration vanishes and
the total acceleration is normal to the curve and of magnitude equal
to the square of the velocity multiplied by the curvature.
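
The decomposition (101) can be illustrated numerically. The sketch below assumes a particle on a circle of radius 2 whose arc length grows like s(t) = t²; the path, the time law, and the difference-quotient steps are all assumptions of the example. For this motion the tangential acceleration should equal d²s/dt² = 2 and the normal acceleration should equal (ds/dt)² times the curvature 1/2.

```python
import math

R_CIRCLE = 2.0  # radius of the illustrative circular path (assumption)

def position(t):
    # Arc length along the circle is s(t) = t**2, so the speed grows with t.
    s = t * t
    return (R_CIRCLE * math.cos(s / R_CIRCLE),
            R_CIRCLE * math.sin(s / R_CIRCLE),
            0.0)

def velocity(t, h=1e-5):
    # Central difference quotient for dX/dt.
    a, b = position(t + h), position(t - h)
    return tuple((p - q) / (2.0 * h) for p, q in zip(a, b))

def acceleration(t, h=1e-4):
    # Second central difference quotient for d^2X/dt^2.
    a, m, b = position(t + h), position(t), position(t - h)
    return tuple((p - 2.0 * q + r) / (h * h) for p, q, r in zip(a, m, b))

def split_acceleration(t):
    # Returns (speed, tangential component, length of the normal component).
    v, acc = velocity(t), acceleration(t)
    speed = math.sqrt(sum(c * c for c in v))
    tangent = tuple(c / speed for c in v)
    a_tan = sum(p * q for p, q in zip(acc, tangent))
    normal_part = tuple(p - a_tan * q for p, q in zip(acc, tangent))
    a_norm = math.sqrt(sum(c * c for c in normal_part))
    return speed, a_tan, a_norm

speed, a_tan, a_norm = split_acceleration(1.5)
print("speed ds/dt             =", speed)
print("tangential acceleration =", a_tan)
print("normal acceleration     =", a_norm)
print("speed^2 * curvature     =", speed * speed / R_CIRCLE)
```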


Exercises 2.5


1. Verify that the position vector PQ of a point Q with respect to a point P
behaves like a vector in a change of coordinates.

2. Derive the following identities.
(a) grad (αβ) = α grad β + β grad α
(b) div (αU) = U · grad α + α div U
(c) curl (αU) = grad α × U + α curl U
(d) div (U × V) = V · curl U − U · curl V.

3. Let U · ∇ be the symbol for the operator

$$
u_x\frac{\partial}{\partial x} + u_y\frac{\partial}{\partial y} + u_z\frac{\partial}{\partial z}.
$$

Show that
(a) grad (U · V) = U · ∇V + V · ∇U + U × curl V + V × curl U
(b) curl (U × V) = U div V − V div U + V · ∇U − U · ∇V.

4. For the Laplacian operator Δ establish

ΔU = grad div U − curl curl U.

5. Find the equation of the so-called osculating plane of a curve x = f(t),
y = g(t), z = h(t) at the point t0, that is, the limit of the planes passing
through three points of the curve as these points approach the point
with parameter t0.

6. Show that the curvature vector and the tangent vector both lie in the
osculating plane.

7. Let C be a smooth curve with a continuously turning tangent. Let d
denote the shortest distance between two points on the curve and l the
length of arc between the two points. Prove that d − l = o(d) when d
is small.

8. Prove that the curvature of the curve X = X(t), t being an arbitrary
parameter, is given by

$$
k = \frac{1}{\rho} = \frac{\sqrt{|\dot{\mathbf{X}}|^2\,|\ddot{\mathbf{X}}|^2 - (\dot{\mathbf{X}}\cdot\ddot{\mathbf{X}})^2}}{|\dot{\mathbf{X}}|^3}.
$$

9. If X = X(t) is any parametric representation of a curve, then the vector
d²X/dt² with initial point X lies in the osculating plane at X.

10. If C is a continuously differentiable closed curve and A a point not on
C, there is a point B on C that has a shorter distance from A than any
other point on C. Prove that the line AB is normal to the curve.

11. A curve is drawn on the cylinder x² + y² = a² such that the angle
between the z-axis and the tangent at any point P of the curve is equal
to the angle between the y-axis and the tangent plane at P to the
cylinder. Prove that the coordinates of any point P of the curve can be
expressed in terms of a parameter θ by the equations

x = a cos θ, y = a sin θ, z = c + a log sin θ,

and that the curvature of the curve is (1/a) sin θ (1 + sin² θ)^(1/2).

12. Find the equation of the osculating plane (cf. Exercise 5) at the point
θ of the curve x = cos θ, y = sin θ, z = f(θ). Show that if f(θ) =
(cosh Aθ)/A, each osculating plane touches a sphere whose center is
the origin and whose radius is √(1 + 1/A²).

13. (a) Prove that the equation of the plane passing through the three
points t1, t2, t3 on the curve

$$
x = \tfrac{1}{3}at^3,\quad y = \tfrac{1}{2}bt^2,\quad z = ct
$$

is

$$
\frac{3x}{a} - 2(t_1 + t_2 + t_3)\frac{y}{b} + (t_2t_3 + t_3t_1 + t_1t_2)\frac{z}{c} - t_1t_2t_3 = 0.
$$

(b) Show that the point of intersection of the osculating planes at t1,
t2, t3 lies in this plane.

14. Let X = X(s) be an arbitrary curve in space, such that the vector X(s)
is three times continuously differentiable (s is the length of arc). Find
the center of the sphere of closest contact with the curve at the point s.

15. If X = X(s) is a curve on a sphere of unit radius where s is arclength,
then

$$
|\dddot{\mathbf{X}}|^2 - |\ddot{\mathbf{X}}|^4 = |\dddot{\mathbf{X}}|^2 - (\dot{\mathbf{X}}\cdot\dddot{\mathbf{X}})^2 = (\dddot{\mathbf{X}}\cdot[\dot{\mathbf{X}}\times\ddot{\mathbf{X}}])^2
$$

holds.

16. The limit of the ratio of the angle between the osculating planes at two
neighboring points of a curve and of the length of arc between these two
points (i.e., the derivative of the unit normal vector with respect to the
arc s) is called the torsion of the curve. Let ξ1(s), ξ2(s) denote the unit
vectors along the tangent and the curvature vector of the curve X(s);
by ξ3(s) we mean the unit vector orthogonal to ξ1 and ξ2 (the so-called
binormal vector), which is given by [ξ1 × ξ2].
Prove Frenet’s formulae

$$
\dot{\xi}_1 = \frac{\xi_2}{\rho},\qquad
\dot{\xi}_2 = -\frac{\xi_1}{\rho} + \frac{\xi_3}{\tau},\qquad
\dot{\xi}_3 = -\frac{\xi_2}{\tau},
$$

where 1/ρ = k is the curvature and 1/τ the torsion of X(s).

17. Using the vectors ξ1, ξ2, ξ3 of Exercise 16 as coordinate vectors, find
expressions for (a) the third derivative of the vector X with respect to s,
(b) the vector from the point X to the center of the sphere of closest
contact at X.

18. Show that a curve of zero torsion is a plane curve.

19. Consider a fixed point A in space and a variable point P whose motion
is given as a function of the time. Denoting by Ṗ the velocity vector of
P and by a a unit vector in the direction from P to A, show that

$$
\frac{d}{dt}\,|PA| = -\mathbf{a}\cdot\dot{\mathbf{P}}.
$$

20. (a) Let A, B, C be three fixed noncollinear points and let P be a moving
point. Let a, b, c be unit vectors in the directions PA, PB, PC,
respectively; express the velocity vector Ṗ as a linear combination
of these vectors:

$$
\dot{\mathbf{P}} = \mathbf{a}u + \mathbf{b}v + \mathbf{c}w.
$$

Prove that

$$
\dot{\mathbf{a}} = \frac{1}{|A - P|}\,\bigl\{[(\mathbf{a}\cdot\mathbf{b})v + (\mathbf{a}\cdot\mathbf{c})w]\,\mathbf{a} - v\mathbf{b} - w\mathbf{c}\bigr\}.
$$

(b) Prove that the acceleration vector P̈ of the point P is

$$
\ddot{\mathbf{P}} = \alpha\mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c},
$$

where

$$
\alpha = \dot{u} + uv\left(\frac{\mathbf{a}\cdot\mathbf{b}}{|A - P|} - \frac{1}{|B - P|}\right) + uw\left(\frac{\mathbf{a}\cdot\mathbf{c}}{|A - P|} - \frac{1}{|C - P|}\right)
$$

with similar expressions for β and γ.

21. Prove that if z = u(x, y) represents the surface formed by the tangents
of an arbitrary curve, then (a) every osculating plane of the curve is a
tangent plane to the surface and (b) u(x, y) satisfies the equation

$$
u_{xx}u_{yy} - u_{xy}^2 = 0.
$$

CHAPTER 
3 


Developments and Applications 
of the Differential Calculus 


3.1 Implicit Functions 


a. General Remarks 


Frequently in analytical geometry the equation of a curve is given 
not in the form y = f(x) but in the form F(x, y) = 0. A straight line 
may be represented in this way by the equation ax + by + c = 0, 
and an ellipse, by the equation x²/a² + y²/b² = 1. To obtain the
equation of the curve in the form y = f(x) we must “solve” the equation
F(x, y) = 0 for y. In Volume I we considered the special problem of 
finding the inverse of a function y = f(x), that is, the problem of 
solving the equation F(x, y) = y — f(x) = 0 for the variable x. 

These examples suggest the importance of methods for solving an 
equation F(x, y) = 0 for x or for y. We shall find such methods even 
for equations involving functions of more than two variables. 

In the simplest cases, such as the foregoing equations for the 
straight line and ellipse, the solution can readily be found in terms 
of elementary functions. In other cases, the solution can be approxi- 
mated as closely as we desire. For many purposes, however, it is pref- 
erable not to work with the solved form of the equation or with these 
approximations but instead to draw conclusions about the solution by 
directly studying the function F(x, y), in which neither of the varia- 
bles x, y is given preference over the other. 

Not every equation F(x, y) = 0 is the implicit representation
of a function y = f(x) or x = φ(y). It is easy to give examples of
equations F(x, y) = 0 that permit no solution in terms of functions
of one variable. Thus, the equation x² + y² = 0 is satisfied by the
single pair of values x = 0, y = 0 only, while the equation x² + y² +
1 = 0 is satisfied by no real values at all. It is therefore necessary to
investigate more closely the circumstances under which an equation
F(x, y) = 0 defines a function y = f(x) and the properties of this
function.


Exercises 3.1a


1. Suppose that for some pair of values (a, b), f(a, b) = 0. If a is known, give
a constructive iterative method for finding b. Under what conditions
on f will this method work?


b. Geometrical Interpretation 


To clarify the situation we represent the function F(x, y) by the 
surface z = F(x, y) in three-dimensional space. The solutions of 
the equation F(x, y) = 0 are the same as the simultaneous solutions 
of the two equations z = F(x, y) and z = 0. Geometrically, our
problem is to find whether the surface z = F(x, y) intersects the x, y-plane
in curves y = f(x) or x = φ(y). (How far such a curve of
intersection may extend does not concern us here.)

A first possibility is that the surface and the plane have no point
in common. For example the paraboloid z = F(x, y) = x² + y² + 1
lies entirely above the x, y-plane. Here there is no curve of
intersection. Obviously, we need consider only cases in which there is at
least one point (x0, y0) at which F(x0, y0) = 0; the point (x0, y0)
constitutes an “initial point” for our solution.

Knowing an initial solution, we have two possibilities: either the
tangent plane at the point (x0, y0) is horizontal or it is not. If the
tangent plane is horizontal, we can readily show by means of
examples that it may be impossible to extend a solution y = f(x) or
x = φ(y) from (x0, y0). For example, the paraboloid z = x² + y² has the
initial solution x = 0, y = 0, but contains no other point in the
x, y-plane. In contrast, the surface z = xy with the initial solution
x = 0, y = 0 intersects the x, y-plane along the lines x = 0 and y = 0;
but in no neighborhood of the origin can we represent the whole
intersection by a function y = f(x) or by a function x = φ(y) (see
Figs. 3.1 and 3.2). On the other hand, it is quite possible for the
equation F(x, y) = 0 to have such a solution even when the tangent
plane at the initial solution is horizontal, as in the case F(x, y) =
(y − x)⁴ = 0. In the exceptional case of a horizontal tangent plane,
therefore, no definite general statement can be made.

Figure 3.1 The surface u = xy.


Figure 3.2 Contour lines of u = xy.


The remaining possibility is that the tangent plane at the initial
solution is not horizontal. Then, thinking intuitively of the surface
z = F(x, y) as approximated by the tangent plane in a neighborhood
of the initial solution, we may expect that the surface cannot bend
fast enough to avoid cutting the x, y-plane near (x0, y0) in a single
well-defined curve of intersection and that a portion of the curve
near the initial solution can be represented by the equation y = f(x)
or x = φ(y). Analytically, the statement that the tangent plane is
not horizontal means that F_x(x0, y0) and F_y(x0, y0) are not both
zero (see p. 47). This is the basis for the discussion in the next
subsection.


Exercises 3.1b 


1. By examining the surface of z = f(x, y), determine whether the equation 
f(x, y) = 0 can be solved for y as a function of x in a neighborhood of the 
indicated point (xo, yo) for 


(a) f(x, y) = x² − y², x0 = y0 = 0

(b) f(x, y) = [log (x + y)]^(1/2), x0 = 1.5, y0 = −.5
(c) f(x, y) = sin [π(x + y)] − 1, x0 = y0 = 1/4
(d) f(x, y) = x² + y² − y, x0 = y0 = 0.


c. The Implicit Function Theorem 


We now state sufficient conditions for the existence of implicit 
functions and at the same time give a rule for differentiating them: 

Let F(x, y) have continuous derivatives F_x and F_y in a neighborhood
of a point (x0, y0), where

$$
F(x_0, y_0) = 0,\qquad F_y(x_0, y_0) \neq 0. \tag{1}
$$

Then centered at the point (x0, y0), there is some rectangle

$$
x_0 - \alpha \le x \le x_0 + \alpha,\qquad y_0 - \beta \le y \le y_0 + \beta \tag{2}
$$

such that for every x in the interval I given by x0 − α ≤ x ≤ x0 + α
the equation F(x, y) = 0 has exactly one solution y = f(x) lying in
the interval y0 − β ≤ y ≤ y0 + β. This function f satisfies the initial
condition y0 = f(x0) and, for every x in I,

$$
F(x, f(x)) = 0, \tag{3}
$$

$$
y_0 - \beta \le f(x) \le y_0 + \beta, \tag{3a}
$$

$$
F_y(x, f(x)) \neq 0. \tag{3b}
$$

Furthermore, f is continuous and has a continuous derivative in I, given
by the equation

$$
y' = f'(x) = -\frac{F_x}{F_y}. \tag{4}
$$

This is a strictly local existence theorem for solutions of the
equation F(x, y) = 0 in the neighborhood of an initial solution
(x0, y0). It does not indicate how to find such an initial solution or
how to decide if the equation F(x, y) = 0 is satisfied for any (x, y)
at all. These are global questions and beyond the scope of the theorem.
Uniqueness and regularity of the solution y = f(x), also, can be
guaranteed only locally, that is, when y is restricted to the interval
y0 − β ≤ y ≤ y0 + β. The need for such restrictions is evident from
the simple example of the equation

$$
F(x, y) = x^2 + y^2 - 1 = 0.
$$

For every x with −1 < x < 1 the equation has two different solutions
y = ±√(1 − x²). A single-valued solution y = f(x) is obtained by
prescribing arbitrarily one of the signs at each x. It is clear that in this
way we can find solutions that are discontinuous for every x,
choosing, for example, the positive sign for rational x and the
negative one for irrational x. Continuous solutions y = f(x) are obtained
if we restrict y to a constant sign. This sign can be fixed by choosing
for a given x0 in −1 < x0 < 1 one of the two possible values y0 for which
x0² + y0² = 1. A unique continuous solution y = f(x) with y0 = f(x0)
is obtained then for all x in −1 < x < 1 by requiring y to satisfy x² +
y² = 1 and to have the same sign as y0. Geometrically, the graph of
f is either the upper or the lower semicircle, whichever contains the
point (x0, y0). The function f has a continuous derivative

$$
y' = -\frac{F_x}{F_y} = -\frac{x}{y} = -\frac{x}{f(x)}
$$

for −1 < x < 1. With y defined to be zero for x = ±1, the solution
y = f(x) will be continuous in the closed interval −1 ≤ x ≤ 1.
However, the derivative y′ then becomes infinite at the end points of the
interval, since F_y = 0 there.

We shall prove the general theorem in the next section. We observe
here only that once the existence and the differentiability of the
function f(x) satisfying (3) have been established, we can find an
explicit expression for f′(x) by applying the chain rule [see (18) p. 55]
to differentiate F(x, y). This yields

$$
F_x + F_y f'(x) = 0,
$$

and leads to formula (4) as long as F_y ≠ 0. Equivalently, if the
equation F(x, y) = 0 determines y as a function of x, we conclude that

$$
dF = F_x\,dx + F_y\,dy = 0
$$

and, hence, that

$$
dy = -\frac{F_x}{F_y}\,dx.
$$


An implicit function y = f(x) can be differentiated to any given
order, provided the function F(x, y) possesses continuous partial
derivatives of that same order. For example, if F(x, y) has continuous
first and second derivatives in the rectangle (2), the right side of
equation (4) is a compound function of x:

$$
y' = -\frac{F_x(x, f(x))}{F_y(x, f(x))}.
$$

Since, by (3b), the denominator does not vanish and since f(x) already
is known to have a continuous first derivative, we conclude from (4)
that y′ has a continuous derivative; by the chain rule y″ is given by

$$
y'' = -\frac{F_yF_{xx} + F_yF_{xy}f' - F_xF_{xy} - F_xF_{yy}f'}{F_y^{2}}.
$$

Substituting the expression (4) for f′, we find that

$$
y'' = -\frac{F_y^{2}F_{xx} - 2F_xF_yF_{xy} + F_x^{2}F_{yy}}{F_y^{3}}. \tag{5}
$$

The rules (4) and (5) for finding the derivatives of an implicit func- 
tion y = f(x) can be used whenever the existence of f in an interval has 
been established from the general theorem on implicit functions, even 
in cases where it is impossible to express y explicitly in terms of ele- 
mentary functions (rational functions, trigonometric functions, etc.). 
Even if we can solve the equation F(x,y) = 0 explicitly for y, it is usu- 
ally easier to find the derivatives of y from the formulae (4) and (5), 
without making use of any explicit representation of y = f(x). 
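
To see the rules (4) and (5) at work, the sketch below applies them to the circle F(x, y) = x² + y² − 1 = 0 and compares the results with the derivatives of the explicit solution y = √(1 − x²). The function names and the sample point (0.6, 0.8) are assumptions of this illustration.

```python
import math

def circle_partials(x, y):
    # Partial derivatives of F(x, y) = x^2 + y^2 - 1.
    return {"Fx": 2.0 * x, "Fy": 2.0 * y, "Fxx": 2.0, "Fxy": 0.0, "Fyy": 2.0}

def implicit_derivatives(x, y):
    # y' from rule (4) and y'' from rule (5); requires Fy != 0.
    p = circle_partials(x, y)
    Fx, Fy = p["Fx"], p["Fy"]
    y1 = -Fx / Fy
    y2 = -(Fy ** 2 * p["Fxx"] - 2.0 * Fx * Fy * p["Fxy"] + Fx ** 2 * p["Fyy"]) / Fy ** 3
    return y1, y2

def explicit_derivatives(x):
    # Derivatives of the upper semicircle y = sqrt(1 - x^2), for comparison.
    root = math.sqrt(1.0 - x * x)
    return -x / root, -1.0 / root ** 3

x0, y0 = 0.6, 0.8
print("implicit rules (4), (5):", implicit_derivatives(x0, y0))
print("explicit solution      :", explicit_derivatives(x0))
```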


Examples 


1. The equation of the lemniscate (Volume I, p. 102)

$$
F(x, y) = (x^2 + y^2)^2 - 2a^2(x^2 - y^2) = 0
$$

is not easily solved for y. For x = 0, y = 0 we obtain F = 0, F_x = 0,
F_y = 0. Here our theorem fails, as might be expected from the fact that
two different branches of the lemniscate pass through the origin.
However, at all points of the curve where y ≠ 0, our rule applies, and the
derivative of the function y = f(x) is given by

$$
y' = -\frac{F_x}{F_y} = -\frac{4x(x^2 + y^2) - 4a^2x}{4y(x^2 + y^2) + 4a^2y}.
$$

We can obtain important information about the curve from this
equation, without using the explicit expression for y. For example, maxima
or minima might occur where y′ = 0, that is, for x = 0 or for x² + y² =
a². From the equation of the lemniscate, y = 0 when x = 0; but at the
origin there is no extreme value (cf. Fig. 1.8.3, Volume I, p. 103). The
two equations therefore give the four points (±(a/2)√3, ±a/2) as the
maxima and minima.

2. The folium of Descartes has the equation

$$
F(x, y) = x^3 + y^3 - 3axy = 0
$$

(cf. Fig. 3.3), with awkward explicit solutions. At the origin, where
the curve intersects itself, our rule again fails, since at that point
F = F_x = F_y = 0. For all points at which y² ≠ ax we have

$$
y' = -\frac{F_x}{F_y} = -\frac{x^2 - ay}{y^2 - ax}.
$$

Accordingly, there is a zero of the derivative when x² − ay = 0 or, if
we use the equation of the curve, when

$$
x = a\sqrt[3]{2},\qquad y = a\sqrt[3]{4}.
$$

Figure 3.3 Folium of Descartes.
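
The points with horizontal tangent found in the two examples can be confirmed by direct substitution. In the sketch below the value of the constant `a` is an arbitrary assumption; each function returns F together with F_x and F_y so that the slope from rule (4) can be evaluated.

```python
import math

def lemniscate(x, y, a):
    # F, Fx, Fy for F = (x^2 + y^2)^2 - 2 a^2 (x^2 - y^2).
    F = (x * x + y * y) ** 2 - 2.0 * a * a * (x * x - y * y)
    Fx = 4.0 * x * (x * x + y * y) - 4.0 * a * a * x
    Fy = 4.0 * y * (x * x + y * y) + 4.0 * a * a * y
    return F, Fx, Fy

def folium(x, y, a):
    # F, Fx, Fy for F = x^3 + y^3 - 3 a x y.
    F = x ** 3 + y ** 3 - 3.0 * a * x * y
    Fx = 3.0 * x * x - 3.0 * a * y
    Fy = 3.0 * y * y - 3.0 * a * x
    return F, Fx, Fy

def slope(Fx, Fy):
    # Rule (4): y' = -Fx / Fy, valid where Fy != 0.
    return -Fx / Fy

a = 1.5  # illustrative value of the constant a

# The four candidate extrema of the lemniscate.
lemniscate_points = [(sx * a * math.sqrt(3.0) / 2.0, sy * a / 2.0)
                     for sx in (1.0, -1.0) for sy in (1.0, -1.0)]
for x, y in lemniscate_points:
    F, Fx, Fy = lemniscate(x, y, a)
    print("lemniscate", (x, y), "F =", F, "slope =", slope(Fx, Fy))

# The point of the folium where the derivative vanishes.
folium_point = (a * 2.0 ** (1.0 / 3.0), a * 2.0 ** (2.0 / 3.0))
F, Fx, Fy = folium(folium_point[0], folium_point[1], a)
print("folium", folium_point, "F =", F, "slope =", slope(Fx, Fy))
```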


Exercises 3.1c


1. Prove that the following equations have unique solutions for y near the
points indicated:

(a) x² + xy + y² = 7 (2, 1)
(b) x cos xy = 0 (1, π/2)
(c) xy + log xy = 1 (1, 1)
(d) x⁵ + y⁵ + xy = 3 (1, 1).

2. Find the first derivatives of the solutions in Exercise 1 and give their
values at the indicated points.

3. Find the second derivatives of the solutions in Exercise 1 and give their
values at the indicated points.

4. Which of the implicitly defined functions of Exercise 1 are convex at
the indicated points?

5. Find the maximum and minimum values of the function y that satisfies
the equation x² + xy + y² = 27.

6. Let f_y(x, y) be continuous on a neighborhood of the point (x0, y0). Show
that the equation

$$
y = y_0 + \int_{x_0}^{x} f(\xi, y)\,d\xi
$$

determines y as a function of x in some interval about x = x0.


d. Proof of the Implicit Function Theorem 


Existence of the implicit function follows directly from the
intermediate value theorem (see Volume I, p. 44). Assume that F(x, y) is
defined and has continuous first derivatives in a neighborhood of the
point (x0, y0), and let

$$
F(x_0, y_0) = 0,\qquad F_y(x_0, y_0) \neq 0.
$$

Without loss of generality we assume that m = F_y(x0, y0) > 0.
Otherwise, we merely replace the function F by −F, which leaves the points
described by the equation F(x, y) = 0 unaltered. Since F_y(x, y) is
continuous, we can find a rectangle R with center (x0, y0) and so small
that R lies completely in the domain of F and F_y(x, y) > m/2
throughout R. Let R be the rectangle

$$
x_0 - a \le x \le x_0 + a,\qquad y_0 - \beta \le y \le y_0 + \beta
$$

(see Fig. 3.4). Since F_x(x, y) also is continuous, we conclude that F_x
is bounded in R. Thus, there exist positive constants m, M such that

$$
F_y(x, y) > \frac{m}{2},\qquad |F_x(x, y)| \le M \quad\text{for } (x, y) \text{ in } R. \tag{6}
$$

Figure 3.4


For any fixed $x$ between $x_0 - a$ and $x_0 + a$ the expression
$F(x, y)$ is a continuous and monotonically increasing function of $y$
for $y_0 - \beta \le y \le y_0 + \beta$. If

\[ F(x, y_0 + \beta) > 0, \qquad F(x, y_0 - \beta) < 0, \tag{7} \]

we can be sure that there exists a single value $y$ intermediate
between $y_0 - \beta$ and $y_0 + \beta$ at which $F(x, y)$ vanishes. For
the given $x$ the equation $F(x, y) = 0$ will then have a single
solution $y = f(x)$ for which

\[ y_0 - \beta < y < y_0 + \beta. \]

To prove (7), we observe that by the mean value theorem

\[ F(x, y_0) = F(x, y_0) - F(x_0, y_0) = F_x(\xi, y_0)(x - x_0), \]

where $\xi$ is intermediate between $x_0$ and $x$. Hence, if $\alpha$
denotes a number between 0 and $a$, we have

\[ |F(x, y_0)| \le |F_x(\xi, y_0)|\,|x - x_0| \le M\alpha \quad \text{for } |x - x_0| \le \alpha. \]

Similarly, it follows from $F_y > m/2$ that

\[ F(x, y_0 + \beta) = [F(x, y_0 + \beta) - F(x, y_0)] + F(x, y_0) > \tfrac{1}{2} m\beta - M\alpha, \]

\[ F(x, y_0 - \beta) = -[F(x, y_0) - F(x, y_0 - \beta)] + F(x, y_0) < -\tfrac{1}{2} m\beta + M\alpha. \]

Thus, the inequalities (7) hold for any $x$ in the interval
$x_0 - \alpha \le x \le x_0 + \alpha$ provided we take $\alpha$ so small
that $\alpha \le a$ and $\alpha < m\beta/2M$.

For any $x$ with $|x - x_0| \le \alpha$ this proves existence and
uniqueness of a solution $y = f(x)$ of the equation $F(x, y) = 0$ such
that $|y - y_0| \le \beta$ and $F_y(x, y) > m/2 > 0$. For $x = x_0$ the
equation $F(x, y) = 0$ has the solution $y = y_0$ corresponding to our
initial point. Since $y_0$ certainly lies between $y_0 - \beta$ and
$y_0 + \beta$, we see that $f(x_0) = y_0$. Continuity and
differentiability of $f(x)$ now follow from the mean value theorem for
functions of several variables applied to $F(x, y)$ [see (33), p. 67].
Let $x$ and $x + h$ be two values between $x_0 - \alpha$ and
$x_0 + \alpha$. Let $y = f(x)$ and $y + k = f(x + h)$ be the
corresponding values of $f$, where $y$ and $y + k$ lie between
$y_0 - \beta$ and $y_0 + \beta$. Then $F(x, y) = 0$,
$F(x + h, y + k) = 0$. It follows that

\[ 0 = F(x + h, y + k) - F(x, y)
     = F_x(x + \theta h, y + \theta k)\,h + F_y(x + \theta h, y + \theta k)\,k, \]

where $\theta$ is a suitable intermediate value between 0 and 1.¹
Using $F_y \ne 0$, we can divide by $F_y$ and find that

\[ \frac{k}{h} = -\frac{F_x(x + \theta h, y + \theta k)}{F_y(x + \theta h, y + \theta k)}. \tag{8} \]

Since $|F_x| \le M$, $|F_y| > m/2$ for all points of our rectangle, we
find that the right-hand side is bounded by $2M/m$. Thus

\[ |k| \le \frac{2M}{m}\,|h|. \]

Hence, $k = f(x + h) - f(x) \to 0$ for $h \to 0$, which shows that
$y = f(x)$ is a continuous function. We conclude from (8) that for
fixed $x$ and for $y = f(x)$,

\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{k}{h}
   = -\lim_{h \to 0} \frac{F_x(x + \theta h, y + \theta k)}{F_y(x + \theta h, y + \theta k)}
   = -\frac{F_x(x, y)}{F_y(x, y)}. \]

This establishes the differentiability of $f$ and at the same time
yields formula (4) for the derivative.

The proof hinges on the assumption $F_y(x_0, y_0) \ne 0$, from which we
could conclude that $F_y$ is of constant sign in a sufficiently small
neighborhood of $(x_0, y_0)$ and that $F(x, y)$ for fixed $x$ is a
monotone function of $y$.

¹Observe that the mean value theorem can be applied here, since the
segment joining any two points of the rectangle $|x - x_0| \le \alpha$,
$|y - y_0| \le \beta$ lies wholly within the rectangle.

The proof merely tells us that the function y = f(x) exists. It is a 
typical example of a pure “existence theorem,” in which the practical
possibility of calculating the solution is not considered. Of course, 
we could apply any of the numerical methods discussed in Volume I 
(pp. 494 ff.) to approximate the solution y of the equation F(x, y) = 0 
for fixed x. 
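
Since F(x, y) is monotone in y on the interval in question, the simplest such method is bisection. The following is a minimal sketch under stated assumptions, not the method of Volume I: the function name implicit_solve, the sample equation x² + y² − 1 = 0 near (0, 1), and the interval 0.5 ≤ y ≤ 1.5 are illustrative choices that do not appear in the text.

```python
def implicit_solve(F, x, y_lo, y_hi, tol=1e-12):
    """Solve F(x, y) = 0 for y in [y_lo, y_hi] by bisection.

    Assumes F(x, .) is continuous and increasing on the interval with
    F(x, y_lo) < 0 < F(x, y_hi), which is the sign condition (7).
    """
    if not (F(x, y_lo) < 0 < F(x, y_hi)):
        raise ValueError("sign condition (7) fails on this interval")
    while y_hi - y_lo > tol:
        mid = 0.5 * (y_lo + y_hi)
        if F(x, mid) < 0:
            y_lo = mid      # the zero lies in the upper half
        else:
            y_hi = mid      # the zero lies in the lower half
    return 0.5 * (y_lo + y_hi)

# Illustrative equation (an assumption): F(x, y) = x^2 + y^2 - 1 near (0, 1)
F = lambda x, y: x * x + y * y - 1.0
Fx = lambda x, y: 2.0 * x
Fy = lambda x, y: 2.0 * y

x, h = 0.3, 1e-6
y = implicit_solve(F, x, 0.5, 1.5)
# central difference quotient of f compared with f'(x) = -F_x/F_y
numeric = (implicit_solve(F, x + h, 0.5, 1.5)
           - implicit_solve(F, x - h, 0.5, 1.5)) / (2 * h)
formula = -Fx(x, y) / Fy(x, y)
```

The difference quotient agrees with the derivative formula to within the accuracy of the bisection, which illustrates that the existence proof also suggests a workable, if slow, computation.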


Exercises 3.1d 


1. Give an example of a function $f(x, y)$ such that (a) $f(x, y) = 0$
can be solved for y as a function of x near $x = x_0$, $y = y_0$, and
(b) $f_y(x_0, y_0) = 0$.

2. Give an example of an equation $F(x, y) = 0$ that can be solved for y
as a function $y = f(x)$ near a point $(x_0, y_0)$, such that f is not
differentiable at $x_0$.

3. Let $\phi(x)$ be defined for all real values of x. Show that the
equation $F(x, y) = y^3 - y^2 + (1 + x^2)y - \phi(x) = 0$ defines a
unique value of y for each value of x.


e. The Implicit Function Theorem for More Than Two 
Independent Variables 


The implicit function theorem can be extended to a function of 
several independent variables as follows: 


Let $F(x, y, \ldots, z, u)$ be a continuous function of the independent
variables $x, y, \ldots, z, u$, with continuous partial derivatives
$F_x, F_y, \ldots, F_z, F_u$. Let $(x_0, y_0, \ldots, z_0, u_0)$ be an
interior point of the domain of definition of $F$, for which

\[ F(x_0, y_0, \ldots, z_0, u_0) = 0 \quad \text{and} \quad F_u(x_0, y_0, \ldots, z_0, u_0) \ne 0. \]

Then we can mark off an interval $u_0 - \beta \le u \le u_0 + \beta$
about $u_0$ and a rectangular region $R$ containing
$(x_0, y_0, \ldots, z_0)$ in its interior such that for every
$(x, y, \ldots, z)$ in $R$, the equation $F(x, y, \ldots, z, u) = 0$ is
satisfied by exactly one value of $u$ in the interval
$u_0 - \beta \le u \le u_0 + \beta$.¹ For this value of $u$, which we
denote by $u = f(x, y, \ldots, z)$, the equation

\[ F(x, y, \ldots, z, f(x, y, \ldots, z)) = 0 \]

holds identically in $R$; in addition,

\[ u_0 = f(x_0, y_0, \ldots, z_0), \]

\[ u_0 - \beta < f(x, y, \ldots, z) < u_0 + \beta; \qquad F_u(x, y, \ldots, z, f(x, y, \ldots, z)) \ne 0. \]

The function $f$ is a continuous function of the independent variables
$x, y, \ldots, z$, and possesses continuous partial derivatives given by
the equations

\[ F_x + F_u f_x = 0, \quad F_y + F_u f_y = 0, \quad \ldots, \quad F_z + F_u f_z = 0. \tag{9a} \]

¹The value $\beta$ and the rectangular region $R$ are not determined
uniquely. The assertion of the theorem is valid if $\beta$ is any
sufficiently small positive number and if we choose $R$ (depending on
$\beta$) sufficiently small.


The proof follows exactly the same lines that were given in the pre- 
vious section for the solution of the equation F(x, u) = 0 and offers 
no further difficulty. 

It is suggestive to combine the differentiation formulae (9a) in the
single equation

\[ F_x\,dx + F_y\,dy + \cdots + F_z\,dz + F_u\,du = 0. \tag{9b} \]

In words, if the variables $x, y, \ldots, z, u$ are not independent of
one another but are subject to the condition
$F(x, y, \ldots, z, u) = 0$, then the linear parts of the increments of
these variables are likewise not independent but are connected by the
linear equation

\[ dF = F_x\,dx + F_y\,dy + \cdots + F_z\,dz + F_u\,du = 0. \]

If we replace $du$ in (9b) by the expression
$u_x\,dx + u_y\,dy + \cdots + u_z\,dz$ and then equate the coefficient
of each of the mutually independent differentials
$dx, dy, \ldots, dz$ to zero, we retrieve the differentiation
formulae (9a).

Incidentally, the concept of implicit function enables us to give a
general definition of an algebraic function. We say that
$u = f(x, y, \ldots)$ is an algebraic function of the independent
variables $x, y, \ldots$ if $u$ can be defined implicitly by an equation
$F(x, y, \ldots, u) = 0$, where $F$ is a polynomial in the arguments
$x, y, \ldots, u$; briefly, if $u$ “satisfies an algebraic equation.”
A function that satisfies no algebraic equation is called
transcendental.

As an example, we apply our differentiation formulae to the
equation of the sphere,

\[ F(x, y, u) = x^2 + y^2 + u^2 - 1 = 0. \]

For the partial derivatives, we obtain

\[ u_x = -\frac{x}{u}, \qquad u_y = -\frac{y}{u}, \]

and by further differentiation

\[ u_{xx} = -\frac{1}{u} + \frac{x}{u^2}\,u_x = -\frac{x^2 + u^2}{u^3}, \]

\[ u_{xy} = \frac{x}{u^2}\,u_y = -\frac{xy}{u^3}, \]

\[ u_{yy} = -\frac{1}{u} + \frac{y}{u^2}\,u_y = -\frac{y^2 + u^2}{u^3}. \]
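
The second derivatives of the sphere example can be compared with difference quotients of the explicit solution u = √(1 − x² − y²). This is a minimal sketch under stated assumptions: the upper hemisphere, the sample point (0.3, 0.4), the step h, and the helper names are illustrative and not taken from the text.

```python
import math

def u(x, y):
    # upper unit hemisphere: the solution of x^2 + y^2 + u^2 - 1 = 0 with u > 0
    return math.sqrt(1.0 - x * x - y * y)

def second_differences(x, y, h=1e-4):
    # central second differences approximating u_xx, u_xy, u_yy
    uxx = (u(x + h, y) - 2 * u(x, y) + u(x - h, y)) / h**2
    uyy = (u(x, y + h) - 2 * u(x, y) + u(x, y - h)) / h**2
    uxy = (u(x + h, y + h) - u(x + h, y - h)
           - u(x - h, y + h) + u(x - h, y - h)) / (4 * h**2)
    return uxx, uxy, uyy

x, y = 0.3, 0.4                      # illustrative point inside the unit disk
w = u(x, y)
# closed forms obtained from the implicit differentiation formulae
formulas = (-(x * x + w * w) / w**3, -x * y / w**3, -(y * y + w * w) / w**3)
numeric = second_differences(x, y)
```

The two triples agree to several decimal places; the remaining discrepancy is the truncation and rounding error of the difference quotients.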


Exercises 3.1e


1. Show that the equation x + y + z = sin xyz can be solved for z near 
(0, 0, 0). Find the partial derivatives of the solution. 


2. For each of the following equations examine whether it has a unique 
solution for z as a function of the remaining variables near the indi- 
cated point: 


(a) $\sin x + \cos y + \tan z = 0$  ($x = 0$, $y = \pi/2$, $z = \pi$)
(b) x? + 2y2 + 322 —-w=0 (x = 1, y = 2, z = —1, w= 8) 
(c) $1 + x + y = \cosh(x + z) + \sinh(y + z)$  ($x = y = z = 0$).


3. Show that x + y + z + xyz? = 0 defines z implicitly as a function of x 
and y in a neighborhood of (0, 0, 0). Expand z to fourth order in powers
of x and y.


3.2 Curves and Surfaces in Implicit Form


a. Plane Curves in Implicit Form 


The description of a plane curve by an equation of the form $y = f(x)$
gives asymmetric preference to one of the coordinates. The tangent
and the normal to the curve were found (see Volume I, pp. 344-345)
to be given by the respective equations

\[ (\eta - y) - (\xi - x) f'(x) = 0 \tag{10a} \]

and

\[ (\eta - y) f'(x) + (\xi - x) = 0, \tag{10b} \]

where $\xi$, $\eta$ are the “running coordinates” of an arbitrary point
on the tangent or normal, and $x$, $y$ are the coordinates of the point
on the curve. The curvature of the curve is

\[ k = \frac{f''}{(1 + f'^2)^{3/2}} \tag{10c} \]

(see Volume I, p. 357). For a point of inflection the condition

\[ f''(x) = 0 \tag{10d} \]

holds. We shall now obtain the corresponding symmetrical formulae
for curves represented implicitly by an equation of the type
$F(x, y) = 0$. We do this under the assumption that at the point in
question $F_x$ and $F_y$ are not both 0, so that

\[ F_x^2 + F_y^2 \ne 0. \tag{11} \]

If we suppose that $F_y \ne 0$, say, we can substitute for $f'(x)$ in
(10a, b) its value from (4), p. 221, and at once obtain the equation of
the tangent in the form

\[ (\xi - x) F_x + (\eta - y) F_y = 0 \tag{12a} \]

and that of the normal in the form

\[ (\xi - x) F_y - (\eta - y) F_x = 0. \tag{12b} \]

For $F_y = 0$, $F_x \ne 0$ we obtain the same equations by starting from
the solution of the implicit equation $F(x, y) = 0$ in the form
$x = g(y)$.

The direction cosines of the normal to the curve at the point $(x, y)$,
that is, the direction cosines of the normal to the line with equation
(12a) in the $\xi, \eta$-plane, are given by

\[ \cos\alpha = \frac{F_x}{\sqrt{F_x^2 + F_y^2}}, \qquad
   \sin\alpha = \frac{F_y}{\sqrt{F_x^2 + F_y^2}} \tag{12c} \]

[see (20), p. 135]. Similarly, the direction cosines of the tangent to
the curve, that is, of the normal to the line (12b), are

\[ \cos\beta = \frac{F_y}{\sqrt{F_x^2 + F_y^2}}, \qquad
   \sin\beta = \frac{-F_x}{\sqrt{F_x^2 + F_y^2}}. \tag{12d} \]

There are actually two directions normal to the curve at a given
point, the one with direction cosines (12c) and the opposite one. The
normal given by (12c) has the same direction as the vector with
components $F_x$, $F_y$, the gradient of $F$ (see p. 205). We saw on
p. 206 that the direction of the gradient vector is the one in which
$F$ increases fastest; thus, at a point of the curve $F(x, y) = 0$ the
gradient points into the region $F > 0$ and the same holds for the
normal direction determined by the formulae (12c).

Formula (5), p. 228 gave the expression for the second derivative
$y'' = f''(x)$ of a function given in implicit form $F(x, y) = 0$. It
follows that the necessary condition $f'' = 0$ for the occurrence of a
point of inflection can be written as

\[ F_y^2 F_{xx} - 2 F_x F_y F_{xy} + F_x^2 F_{yy} = 0 \tag{13} \]

for curves given implicitly. In this formula there is no preference for
either of the two variables $x$, $y$. It is completely symmetric and no
longer requires the assumption that $F_y \ne 0$. This symmetric
character reflects, of course, the fact that the notion of point of
inflection has a geometrical meaning quite independent of any
coordinate system.

If we substitute formula (5) for $f''(x)$ into the formula (10c) for the
curvature $k$ of the curve, we again obtain an expression¹ symmetric in
$x$ and $y$,

\[ k = \frac{F_y^2 F_{xx} - 2 F_x F_y F_{xy} + F_x^2 F_{yy}}{(F_x^2 + F_y^2)^{3/2}}. \tag{14a} \]

Introducing the radius of curvature

\[ \rho = \frac{1}{k}, \tag{14b} \]

we find for the coordinates $\xi$, $\eta$ of the center of curvature,
the point on the inner normal at distance $\rho$ from $(x, y)$ (see
Volume I, p. 358),

\[ \xi = x - \rho\,\frac{F_x}{\sqrt{F_x^2 + F_y^2}}, \qquad
   \eta = y - \rho\,\frac{F_y}{\sqrt{F_x^2 + F_y^2}}. \tag{14c} \]

If instead of the curve $F(x, y) = 0$, we consider the curve

\[ F(x, y) = c, \]

where $c$ is a constant, everything in the preceding discussions remains
the same. We only have to replace the function $F(x, y)$ by
$F(x, y) - c$, which has the same derivatives as the original function.
Thus, for these curves, the equations of the tangent, normal, and so on
have exactly the same form as above.

¹For the sign of the curvature, see Volume I, p. 357. The curvature $k$
defined by formula (14a) is positive if $F$ increases on the “outer”
side of the curve, that is, if the tangent to the curve near the point
of contact lies in the region $F \ge 0$.

The class of all curves $F(x, y) - c = 0$ that we obtain when we
allow $c$ to range through all the values of an interval forms the
family of “contour lines,” or “level lines,” of the function $F(x, y)$
(see p. 14). More generally, we obtain a one-parameter family of curves
from an equation of the form

\[ F(x, y, c) = 0, \]

which for each constant value of the parameter $c$ yields a curve
$\Gamma_c$ in implicit form. For a point $(x, y)$ lying on the curve
$\Gamma_c$, that is, satisfying the equation $F(x, y, c) = 0$, all the
formulae derived previously apply. In particular, the gradient vector
$(F_x(x, y, c), F_y(x, y, c))$ is normal to $\Gamma_c$ at the point
$(x, y)$.
As an example, we consider the ellipse

\[ F(x, y) = \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \tag{15a} \]

By (12a) the equation of the tangent at the point $(x, y)$ is

\[ (\xi - x)\frac{x}{a^2} + (\eta - y)\frac{y}{b^2} = 0; \]

hence, from (15a),

\[ \frac{\xi x}{a^2} + \frac{\eta y}{b^2} = 1. \]

We find from (14a) that the curvature is

\[ k = \frac{a^4 b^4}{(a^4 y^2 + b^4 x^2)^{3/2}}. \tag{15b} \]

If $a > b$, this has its greatest value $a/b^2$ at the vertices $y = 0$,
$x = \pm a$. Its least value $b/a^2$ occurs at the other vertices
$x = 0$, $y = \pm b$.
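
Formula (14a) is easy to evaluate mechanically once the partial derivatives are known. The following is a minimal sketch, not part of the original text; the function names and the semi-axes a = 3, b = 2 are illustrative assumptions.

```python
def curvature(Fx, Fy, Fxx, Fxy, Fyy):
    # formula (14a): k = (Fy^2 Fxx - 2 Fx Fy Fxy + Fx^2 Fyy) / (Fx^2 + Fy^2)^(3/2)
    num = Fy * Fy * Fxx - 2.0 * Fx * Fy * Fxy + Fx * Fx * Fyy
    return num / (Fx * Fx + Fy * Fy) ** 1.5

def ellipse_curvature(x, y, a, b):
    # partial derivatives of F(x, y) = x^2/a^2 + y^2/b^2
    return curvature(2 * x / a**2, 2 * y / b**2, 2 / a**2, 0.0, 2 / b**2)

a, b = 3.0, 2.0                              # illustrative semi-axes with a > b
k_major = ellipse_curvature(a, 0.0, a, b)    # vertex on the major axis
k_minor = ellipse_curvature(0.0, b, a, b)    # vertex on the minor axis
```

With these values k_major equals a/b² and k_minor equals b/a² up to rounding, as stated for the vertices of the ellipse.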

If two curves $F(x, y) = 0$ and $G(x, y) = 0$ intersect at the point
$(x, y)$, the angle between the curves is defined as the angle $\omega$
formed by their tangents (or normals) at the point of intersection. If
we recall that the gradients give the direction of the normals and
apply formula (7), p. 128 for the angle between two vectors, we find
that

\[ \cos\omega = \frac{F_x G_x + F_y G_y}{\sqrt{F_x^2 + F_y^2}\,\sqrt{G_x^2 + G_y^2}}. \tag{16} \]

Here $\cos\omega$ is determined uniquely by the choice of $\omega$ as
angle between the normals of the two curves in the directions of
increasing $F$ and $G$.

Putting $\omega = \pi/2$ in (16), we obtain the condition for
orthogonality, that is, for the curves to intersect at right angles at
the point $(x, y)$:

\[ F_x G_x + F_y G_y = 0. \tag{16a} \]

If the curves touch, that is, have a common tangent and normal at the
point where they meet, their gradient vectors $(F_x, F_y)$ and
$(G_x, G_y)$ must be parallel. This leads to the condition

\[ F_x G_y - F_y G_x = 0. \tag{16b} \]


As an example, we consider the family of parabolas

\[ F(x, y, c) = y^2 - 2c\left(x + \frac{c}{2}\right) = 0 \tag{17a} \]

(see Fig. 3.9, p. 245), all of which have the origin as focus (“confocal
parabolas”). If $c_1 > 0$ and $c_2 < 0$, the two parabolas

\[ F(x, y, c_1) = y^2 - 2c_1\left(x + \frac{c_1}{2}\right) = 0 \]

and

\[ F(x, y, c_2) = y^2 - 2c_2\left(x + \frac{c_2}{2}\right) = 0 \]

intersect each other perpendicularly at two points; for at the points of
intersection

\[ x = -\frac{1}{2}(c_1 + c_2), \qquad y^2 = -c_1 c_2, \]

and hence,

\[ F_x(x, y, c_1) F_x(x, y, c_2) + F_y(x, y, c_1) F_y(x, y, c_2) = 4(c_1 c_2 + y^2) = 0. \]
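
The orthogonality of two confocal parabolas can be confirmed for specific parameters. This is a minimal sketch, not part of the original text; the values c1 = 2 and c2 = −0.5 and the helper name grad are illustrative assumptions.

```python
import math

def grad(x, y, c):
    # gradient (F_x, F_y) of F(x, y, c) = y^2 - 2c(x + c/2)
    return (-2.0 * c, 2.0 * y)

c1, c2 = 2.0, -0.5               # illustrative parameters with c1 > 0 > c2
x = -0.5 * (c1 + c2)             # abscissa of both intersection points
results = []
for y in (math.sqrt(-c1 * c2), -math.sqrt(-c1 * c2)):
    F1 = y * y - 2 * c1 * (x + c1 / 2)    # vanishes: point lies on the first parabola
    F2 = y * y - 2 * c2 * (x + c2 / 2)    # vanishes: point lies on the second parabola
    g1, g2 = grad(x, y, c1), grad(x, y, c2)
    dot = g1[0] * g2[0] + g1[1] * g2[1]   # left side of condition (16a)
    results.append((F1, F2, dot))
```

At both intersection points all three numbers vanish, so the points lie on both curves and the gradients are perpendicular there.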
By (14a) the curvature of the parabola (17a) is given by

\[ k = \frac{c^2}{(c^2 + y^2)^{3/2}}. \]

At the vertex $x = -c/2$, $y = 0$, this reduces to

\[ k = \frac{1}{|c|}. \]

The center of curvature or center of the osculating circle at the vertex
has then by (14c) the coordinates

\[ \xi = -\frac{c}{2} + |c|\,\operatorname{sgn} c = \frac{c}{2}, \qquad \eta = 0, \]

so that the focus (0, 0) lies halfway between the vertex and the center
of curvature.


Exercises 3.2a 


1. Find the equations of the tangent and normal for the curves given
implicitly by the following relations:
(a) $x^2 + 2y^2 - xy = 0$
(b) $e^x \sin y + e^y \cos x = 1$
(c) $\cosh(x + 1) - \sin y = 0$
(d) $x^2 + y^2 = y + \sin x$
(e) $x^3 + y^4 = \cosh y$
(f) $x^y + y^x = 1$.
2. Calculate the curvature of the curve
$\sin x + \cos y = 1$
at the origin.
3. Find the curvature of a curve that is given in polar coordinates by the
equation $f(r, \theta) = 0$.
4. Prove that the intersections of the curve
$(x + y - a)^3 + 27axy = 0$
with the line $x + y = a$ are inflections of the curve.
5. Determine a and b so that the conics
$4x^2 + 4xy + y^2 - 10x - 10y + 11 = 0$
$(y + bx - 1 - b)^2 - a(by - x + 1 - b) = 0$
cut one another orthogonally at the point (1, 1) and have the same
curvature at this point.
6. Let K′ and K″ be two circles having two points A and B in common. If
a circle K is orthogonal to K′ and K″, then it is also orthogonal to every
circle passing through A and B.


b. Singular Points of Curves 


In many of the formulae of the preceding section the expression
$F_x^2 + F_y^2$ occurs in the denominator. Accordingly, we may expect
something unusual to happen when this quantity vanishes, that is,
when $F_x = 0$ and $F_y = 0$ at a point of the curve $F(x, y) = 0$. At
such a point the expression $y' = -F_x/F_y$ for the slope of the tangent
loses its meaning.

We call a point P of a curve regular if in a neighborhood of P either 
variable x or y can be represented as a continuously differentiable 
function of the other. In that case, the curve has a tangent at P and is 
closely approximated by that tangent in a neighborhood of P. If not 
regular, a point of the curve is called singular or a singularity. 

From the implicit function theorem we know that if $F(x, y)$ has
continuous first partial derivatives, then a point of the curve
$F(x, y) = 0$ is regular if at that point $F_x^2 + F_y^2 \ne 0$, for if
$F_y \ne 0$ at $P$, we can solve the equation $F(x, y) = 0$ and obtain a
unique continuously differentiable solution $y = f(x)$. Similarly, if
$F_x \ne 0$ we can solve the equation for $x$.

An important type of singularity is a multiple point, that is, a point
through which two or more branches of the curve pass. For example,
the origin is a multiple point of the lemniscate (Volume I, p. 102)

\[ (x^2 + y^2)^2 - 2a^2(x^2 - y^2) = 0. \]

It is clear that in the neighborhood of a multiple point we cannot
express the equation of the curve uniquely in the form $y = f(x)$ or
$x = g(y)$.

An example of a singularity that is not a multiple point is furnished
by the cubic curve

\[ F(x, y) = y^3 - x^2 = 0 \]

(see Fig. 3.5). Here at the origin $F_x = F_y = 0$. Solving for $y$, we
can put the equation of the curve into the form

\[ y = f(x) = \sqrt[3]{x^2}, \]

where $f$ is continuous but not differentiable at the origin. The curve
has a cusp at that point.

Figure 3.5 The curve $y^3 - x^2 = 0$.


A curve can be regular at a point where both $F_x$ and $F_y$ vanish.
This is exemplified by

\[ F(x, y) = y^3 - x^4 = 0. \]

Here again $F_x = F_y = 0$ at the origin. But solving for $y$, we find

\[ y = f(x) = \sqrt[3]{x^4}, \]

where $f(x)$ is continuously differentiable for all $x$. Thus, the
origin is a regular point. Since $F$ is an even function of $x$, the
curve is symmetric with respect to the y-axis. It is convex and touches
the x-axis at the origin, like the parabola $y = x^2$. Yet the origin is
a somewhat special point for the curve, since there $f''$ becomes
infinite, and there the curve has infinite curvature.
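
The claims about this last curve can be read off from the explicit form. The following short computation is a sketch added for illustration; it writes the cube root of x⁴ as |x|^{4/3} and uses formula (10c) for the curvature.

```latex
\begin{align*}
f(x)   &= \sqrt[3]{x^4} = |x|^{4/3}, \\
f'(x)  &= \tfrac{4}{3}\,\sqrt[3]{x}
          \quad\text{(continuous for all $x$, with $f'(0) = 0$)}, \\
f''(x) &= \tfrac{4}{9}\,|x|^{-2/3} \quad (x \ne 0), \\
k      &= \frac{f''}{(1 + f'^2)^{3/2}}
        = \frac{\tfrac{4}{9}\,|x|^{-2/3}}{\bigl(1 + \tfrac{16}{9}\,|x|^{2/3}\bigr)^{3/2}}
          \longrightarrow \infty \quad\text{as } x \to 0 .
\end{align*}
```

Since $f''(x) > 0$ for $x \ne 0$, the computation also confirms convexity on either side of the origin.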

The trivial example of the equation

\[ F(x, y) = (y - x)^2 = 0 \]

representing the straight line $y = x$ shows that no peculiar behavior
has to be associated with points of a curve $F(x, y) = 0$ for which
$F_x^2 + F_y^2 = 0$. We shall treat singular points more systematically
in Appendix 3.


Exercises 3.2b 


1. Discuss the singular points of the following curves at the origin: 
(a) $F(x, y) = ax^3 + by^3 - cxy = 0$
(b) $F(x, y) = (y^2 - 2x)^2 - x^5 = 0$
(c) $F(x, y) = (1 + e^{1/x})y - x = 0$




(d) $F(x, y) = y^2(a - x) - x^3 = 0$
(e) $F(x, y) = (y - 2x)^2 - x^5 = 0$.

2. The curve $x^3 + y^3 - 3axy = 0$ has a double point at the origin. What
are its tangents there?


3. Draw a graph of the curve $(y - x^2)^2 - x^5 = 0$, and show that it has a
cusp at the origin. What is the peculiarity of this cusp as compared with
the cusp of the curve $x^2 - y^3 = 0$?


4. Show that each of the curves
$(x \cos\alpha - y \sin\alpha - b)^3 = c(x \sin\alpha + y \cos\alpha)^2$,
where $\alpha$ is a parameter and b, c constants, has a cusp and that the
cusps all lie on a circle.


5. Let (x, y) be a double point of the curve $F(x, y) = 0$. Calculate the
angle $\phi$ between the two tangents at (x, y), assuming that not all the
second derivatives of F vanish at (x, y). Find the angle between the
tangents at the double point
(a) of the lemniscate,
(b) of the folium of Descartes (cf. p. 224).
6. Find the curvature at the origin of each of the two branches of the curve
$y(ax + by) = cx^3 + ex^2 y + fxy^2 + gy^3$.


c. Implicit Representation of Surfaces 


Hitherto, we have usually represented a surface in x, y, z-space by
means of a function $z = f(x, y)$. For a given surface in space the
preference for the coordinate z implied in this representation may prove
inconvenient. It is more natural and more general to represent surfaces
in space implicitly by equations of the form $F(x, y, z) = 0$ or
$F(x, y, z) = \text{constant}$. For example, it is better to represent a
sphere about the origin by the symmetric equation
$x^2 + y^2 + z^2 - r^2 = 0$ than by $z = \pm\sqrt{r^2 - x^2 - y^2}$. The
explicit representation of the surface appears then as the special
implicit representation $F(x, y, z) = z - f(x, y) = 0$.

In order to derive the equation of the tangent plane at a point $P$
of the surface $F(x, y, z) = 0$, we make the assumption that at that
point

\[ F_x^2 + F_y^2 + F_z^2 \ne 0, \tag{18} \]

that is, that at least one of the partial derivatives is not 0.¹ If,
say, $F_z \ne 0$, we can find an explicit equation $z = f(x, y)$ for the
surface near $P$. The tangent plane at $P$ has the equation

\[ \zeta - z = (\xi - x) f_x + (\eta - y) f_y \tag{19a} \]

in running coordinates $\xi$, $\eta$, $\zeta$ (see p. 47). Substituting
for the derivatives of $f$ their values $f_x = -F_x/F_z$,
$f_y = -F_y/F_z$ in accordance with formulae (9a), p. 229, we obtain the
equation of the tangent plane in the form

\[ (\xi - x) F_x + (\eta - y) F_y + (\zeta - z) F_z = 0. \tag{19b} \]

¹Just as for curves, the vanishing of the gradient of $F$ usually
corresponds to singular behavior of the surface. We shall not discuss
the nature of such singularities.


The normal to the tangent plane (19b) has the same direction as the
gradient vector $(F_x, F_y, F_z)$ (see p. 134). Hence, the direction
cosines of the normal are given by the expressions

\[ \cos\alpha = \frac{F_x}{\sqrt{F_x^2 + F_y^2 + F_z^2}}, \qquad
   \cos\beta = \frac{F_y}{\sqrt{F_x^2 + F_y^2 + F_z^2}}, \qquad
   \cos\gamma = \frac{F_z}{\sqrt{F_x^2 + F_y^2 + F_z^2}}. \tag{19c} \]

Here, more precisely, we have taken that normal of the plane that
points in the direction of increasing $F$ (see p. 206).

If two surfaces $F(x, y, z) = 0$ and $G(x, y, z) = 0$ intersect at a
point, the angle $\omega$ between the surfaces is defined as the angle
between their tangent planes or, what is the same thing, the angle
between their normals. This is given by

\[ \cos\omega = \frac{F_x G_x + F_y G_y + F_z G_z}{\sqrt{F_x^2 + F_y^2 + F_z^2}\,\sqrt{G_x^2 + G_y^2 + G_z^2}}. \tag{20a} \]

In particular, the condition for perpendicularity (orthogonality) is

\[ F_x G_x + F_y G_y + F_z G_z = 0. \tag{20b} \]


Instead of a surface given by an equation $F(x, y, z) = 0$, we may
consider more generally surfaces given by $F(x, y, z) = c$, where $c$ is
a constant. Different values of $c$ yield different level surfaces of
the function $F$ (see p. 15). At any point $(x, y, z)$ the gradient
vector $(F_x, F_y, F_z)$ is normal to the level surface passing through
that point. Similarly, equation (19b) gives the tangent plane to the
level surface.

As an example, we consider the sphere

\[ x^2 + y^2 + z^2 = r^2. \]

By (19b), the tangent plane at the point $(x, y, z)$ is

\[ (\xi - x)\,2x + (\eta - y)\,2y + (\zeta - z)\,2z = 0 \]

or

\[ \xi x + \eta y + \zeta z = r^2. \]

The direction cosines of the normal are proportional to $x$, $y$, $z$,
that is, the normal coincides with the radius vector drawn from the
origin to the point $(x, y, z)$.

For the most general ellipsoid with the coordinate axes as principal
axes

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \]

the equation of the tangent plane is

\[ \frac{\xi x}{a^2} + \frac{\eta y}{b^2} + \frac{\zeta z}{c^2} = 1. \]
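
Equation (19b) turns directly into a small routine that returns the coefficients of the tangent plane from the gradient. This is a minimal sketch, not part of the original text; the function name tangent_plane, the semi-axes, and the sample point are illustrative assumptions.

```python
def tangent_plane(grad, p):
    """Coefficients (A, B, C, D) of the plane A*xi + B*eta + C*zeta = D
    obtained from (19b): grad . (running point - p) = 0."""
    A, B, C = grad
    D = A * p[0] + B * p[1] + C * p[2]
    return A, B, C, D

a, b, c = 3.0, 2.0, 1.0                           # illustrative semi-axes
p = (a / 3 ** 0.5, b / 3 ** 0.5, c / 3 ** 0.5)    # a point on the ellipsoid
# gradient of F = x^2/a^2 + y^2/b^2 + z^2/c^2 at p
grad = (2 * p[0] / a**2, 2 * p[1] / b**2, 2 * p[2] / c**2)
A, B, C, D = tangent_plane(grad, p)
# dividing by 2 gives the form xi*x/a^2 + eta*y/b^2 + zeta*z/c^2 = 1
normalized = (A / 2, B / 2, C / 2, D / 2)
```

After division by 2 the right-hand side is 1 and the coefficients are x/a², y/b², z/c², which is the tangent-plane equation of the ellipsoid.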


Exercises 3.2c 


1. Find the tangent plane
(a) of the surface

\[ x^3 + 2xy^2 - 7z^3 + 3y + 1 = 0 \]

at the point (1, 1, 1);
(b) of the surface

\[ (x^2 + y^2)^2 + x^2 - y^2 + 7xy + 3x + z^4 - z = 14 \]

at the point (1, 1, 1);
(c) of the surface

\[ \sin^2 x + \cos(y + z) = \tfrac{3}{4} \]

at the point (π/6, π/3, 0);
(d) of the surface

\[ 1 + x \cos \pi z + y \sin \pi z - z^2 = 0 \]

at the point (0, 0, 1);
(e) of the surface

\[ \cos x + \cos y + 2 \sin z = 0 \]

at the point (0, 0, −π/2);
(f) of the surface

\[ x^2 + y^2 = z^2 + \sin z \]

at the point (0, 0, 0).

2. Prove that the three surfaces of the family of surfaces

\[ \frac{xy}{z} = u, \qquad \sqrt{x^2 + z^2} + \sqrt{y^2 + z^2} = v, \qquad \sqrt{x^2 + z^2} - \sqrt{y^2 + z^2} = w \]

that pass through a single point are orthogonal to one another.


3. The points A and B move uniformly with the same velocity, A starting 
from the origin and moving along the z-axis, B starting from the point 
(a, 0, 0) and moving parallel to the y-axis. Find the surface generated 
by the straight lines joining them. 


4. Show that the tangent plane at any point of the surface
$x^2 + y^2 - z^2 = 1$ meets the surface in two straight lines.


5. If $F(x, y, z) = 1$ is the equation of a surface, F being a homogeneous
function of degree h, then the tangent plane at the point (x, y, z) is given
by

\[ \xi F_x + \eta F_y + \zeta F_z = h. \]

6. Let z be defined as a function of x and y by the equation
$x^3 + y^3 + z^3 - 3xyz = 0$.
Express $z_x$ and $z_y$ as functions of x, y, z.


7. Find the angle of intersection of the following pairs of surfaces, at the 
indicated points: 


(a) $2x^4 + 3y^2 - 4z^2 = -4$, $1 + x^2 + y^2 = z^2$, at (0, 0, 1)

(b) x! + y? = 2, cosh (x + y — 2) + sinh (x + z — 1) = 1, at (1, 1, 0) 
(c) $x^2 + y^2 = e^z$, $x^2 + z^2 = e^y$, at (1, 0, 0)

(d) $1 + \sinh(x/\sqrt{z}) = \cosh(y/\sqrt{z})$, $x^2 + y^2 = z^2 - 1$, at (0, 0, 1)

(e) cos r(x? + y) + sin r(x? + z) = 1, x3 + y3 = z? at (0, 0, 0). 


3.3 Systems of Functions, Transformations, and Mappings 


a. General Remarks 


The results we have obtained for implicit functions now enable us
to consider systems of functions, that is, to discuss several functions
simultaneously. In this section we shall consider the particularly
important case of systems in which the number of functions is the same
as the number of independent variables. We begin by investigating the
meaning of such systems in the case of two independent variables.
If the two functions

\[ \xi = \phi(x, y) \quad \text{and} \quad \eta = \psi(x, y) \tag{21a} \]

are both continuously differentiable in a set $R$ of the x, y-plane, the
domain of the functions, we can interpret this system of functions in
two different ways. The first (“active”) interpretation is by means of a
mapping or transformation. (The second, as a coordinate transformation,
will be discussed on p. 246.) To the point $P$ with coordinates $(x, y)$
in the x, y-plane there corresponds the image point $\Pi$ with
coordinates $(\xi, \eta)$ in the $\xi, \eta$-plane.
An example is the affine mapping or transformation

\[ \xi = ax + by, \qquad \eta = cx + dy, \]

where a, b, c, d are constants (see p. 148).

Frequently $(x, y)$ and $(\xi, \eta)$ are interpreted as points of one
and the same plane. In this case we speak of a mapping, or a
transformation of the x, y-plane into itself.

The fundamental problem connected with a mapping is that of its
inversion, the question whether and how $x$ and $y$ can in virtue of the
equations $\xi = \phi(x, y)$ and $\eta = \psi(x, y)$ be regarded as
functions of $\xi$ and $\eta$ and how to determine properties of these
inverse functions.

If for $(x, y)$ varying over the domain $R$ of the mapping the images
$(\xi, \eta)$ vary over a set $B$ in the $\xi, \eta$-plane, we call $B$
the image set of $R$ or the range of the mapping. If two different
points of $R$ always correspond to two different points of $B$, then for
each point $(\xi, \eta)$ of $B$ there is a single point $(x, y)$ of $R$
for which $(\xi, \eta)$ is the image. (The point $(x, y)$ is called the
inverse image, as opposed to the image.) That is, we can invert the
mapping uniquely, determining $x$ and $y$ as functions

\[ x = g(\xi, \eta), \qquad y = h(\xi, \eta), \tag{21b} \]

which are defined in $B$. We then say that the mapping (21a) has a
unique inverse or is a 1-1 mapping, and we call the transformation
(21b) the inverse mapping or transformation of the original one.

If in this mapping the point $P = (x, y)$ describes a curve in the
domain $R$, its image point $(\xi, \eta)$ usually will likewise describe
a curve in the set $B$, which is called the image curve of the first.
For example, to the line $x = c$, which is parallel to the y-axis, there
corresponds in the $\xi, \eta$-plane the curve given in parametric form
by the equations

\[ \xi = \phi(c, y), \qquad \eta = \psi(c, y), \tag{22a} \]

where $y$ is the parameter. Again, to the line $y = k$ there corresponds
the curve

\[ \xi = \phi(x, k), \qquad \eta = \psi(x, k). \tag{22b} \]
If to $c$ and $k$ we assign sequences of equidistant values
$c_1, c_2, c_3, \ldots$ and $k_1, k_2, k_3, \ldots$, then the
rectangular “coordinate net” consisting of the lines $x = $ constant and
$y = $ constant (e.g., the network of lines on ordinary graph paper)
gives rise to a corresponding net of curves, the curvilinear net, in the
$\xi, \eta$-plane (Figs. 3.6 and 3.7). The two families of curves can be
written in implicit form. If we represent the inverse mapping by the
equations (21b), the equations of the curves are simply

\[ g(\xi, \eta) = c \quad \text{and} \quad h(\xi, \eta) = k, \tag{22c} \]

respectively. In many situations the curvilinear net furnishes a useful
geometric picture of the mapping (21a) preferable to the interpretation
of the equations as a two-dimensional surface in four-dimensional
$x, y, \xi, \eta$-space.

Figure 3.6 and Figure 3.7 Nets of curves x = constant and y =
constant in the x, y-plane and the ξ, η-plane.

In the same way, the two families of lines $\xi = \gamma$ and
$\eta = \kappa$ in the $\xi, \eta$-plane correspond to the two families
of curves

\[ \phi(x, y) = \gamma \quad \text{and} \quad \psi(x, y) = \kappa \]

in the x, y-plane.

As an example, we consider the inversion (also called mapping by
reciprocal radii or reflection with respect to the unit circle). This
transformation is given by the equations

\[ \xi = \frac{x}{x^2 + y^2}, \qquad \eta = \frac{y}{x^2 + y^2}. \tag{23a} \]

To the point $P = (x, y)$ there corresponds the point
$\Pi = (\xi, \eta)$ lying on the same ray $OP$ and satisfying the
equation

\[ \xi^2 + \eta^2 = \frac{1}{x^2 + y^2} \quad \text{or} \quad O\Pi = \frac{1}{OP}; \tag{23b} \]

thus, the length of the position vector $OP$ is the reciprocal of the
length of the position vector $O\Pi$. Points inside the unit circle
$x^2 + y^2 = 1$ are mapped on points outside the circle and vice versa.
From (23b) we find that the inverse transformation is

\[ x = \frac{\xi}{\xi^2 + \eta^2}, \qquad y = \frac{\eta}{\xi^2 + \eta^2}, \]

which is again an inversion; that is, the inverse image of a point
coincides with its image.

For the domain $R$ of the mapping (23a) we may take the whole
x, y-plane with the exception of the origin, and for the range $B$ the
whole $\xi, \eta$-plane with the exception of the origin. The lines
$\xi = \gamma$ and $\eta = \kappa$ in the $\xi, \eta$-plane correspond
to the respective circles

\[ x^2 + y^2 - \frac{x}{\gamma} = 0 \quad \text{and} \quad x^2 + y^2 - \frac{y}{\kappa} = 0 \]

in the x, y-plane. In the same way, the rectilinear coordinate net in
the x, y-plane corresponds to the two families of circles touching the
$\xi$-axis and $\eta$-axis at the origin.
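
Both properties of the inversion, that it is its own inverse and that it carries lines to circles through the origin, can be checked on sample points. This is a minimal sketch, not part of the original text; the function name invert and the numerical values are illustrative assumptions.

```python
def invert(x, y):
    # inversion in the unit circle, equations (23a); undefined at the origin
    r2 = x * x + y * y
    return x / r2, y / r2

p = (0.6, -1.7)                  # illustrative point, not the origin
q = invert(*p)
back = invert(*q)                # applying the inversion twice

gamma = 0.8                      # illustrative line xi = gamma
# the point (gamma, 1.3) of that line, pulled back to the x, y-plane
x, y = invert(gamma, 1.3)
on_circle = x * x + y * y - x / gamma    # left side of the circle equation
```

Here back reproduces p, and on_circle vanishes up to rounding, so the pulled-back point lies on the circle x² + y² − x/γ = 0.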

As a further example we consider the mapping

\[ \xi = x^2 - y^2, \qquad \eta = 2xy. \]

The curves $\xi = $ constant give rise in the x, y-plane to the
rectangular hyperbolas $x^2 - y^2 = $ constant, whose asymptotes are the
lines $x = y$ and $x = -y$. The lines $\eta = $ constant also correspond
to a family of rectangular hyperbolas having the coordinate axes as
asymptotes. The hyperbolas of each family cut those of the other family
at right angles (Fig. 3.8). The lines parallel to the axes in the
x, y-plane correspond to two families of parabolas in the
$\xi, \eta$-plane, the parabolas $\eta^2 = 4c^2(c^2 - \xi)$
corresponding to the lines $x = c$ and the parabolas
$\eta^2 = 4k^2(k^2 + \xi)$ corresponding to the lines $y = k$. All these
parabolas have the origin as focus and the $\xi$-axis as axis; they form
a family of confocal and coaxial parabolas (Fig. 3.9).
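
The parabola equations for the images of the coordinate lines can be verified by mapping a few points. This is a minimal sketch, not part of the original text; the function name mapping, the constants c and k, and the sample parameter values are illustrative assumptions.

```python
def mapping(x, y):
    # xi = x^2 - y^2, eta = 2xy
    return x * x - y * y, 2.0 * x * y

c, k = 1.5, 0.7                  # illustrative lines x = c and y = k
residual_c = []
residual_k = []
for t in (-2.0, -0.5, 0.0, 1.0, 3.0):
    xi, eta = mapping(c, t)      # image of a point on the line x = c
    residual_c.append(eta**2 - 4 * c**2 * (c**2 - xi))
    xi, eta = mapping(t, k)      # image of a point on the line y = k
    residual_k.append(eta**2 - 4 * k**2 * (k**2 + xi))
```

All residuals vanish up to rounding, so the image points satisfy η² = 4c²(c² − ξ) and η² = 4k²(k² + ξ) respectively.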

One-one transformations have an important interpretation and
application in the representation of deformations or motions of
continuously distributed substances, such as fluids. If we think of such
a substance as spread out at a given time over a region $R$ and then
deformed by a motion, the substance originally spread over $R$ will in
general cover a region $B$ different from $R$. Each particle of the
substance can be distinguished at the beginning of the motion by its
coordinates $(x, y)$ in $R$ and at the end of the motion by its
coordinates $(\xi, \eta)$ in $B$. The 1-1 character of the
transformation obtained by bringing $(x, y)$ into correspondence with
$(\xi, \eta)$ is simply the mathematical expression of the physically
obvious fact that separate particles remain separate.

Figure 3.8 Orthogonal families of rectangular hyperbolas.

Figure 3.9 Orthogonal families of confocal parabolas.


Exercises 3.3a 

1. Find the image curves of the lines x = const., y = const. under the 

following transformations: 

(a) ξ = e^x cos y, η = e^x sin y

(b) ξ = (x − y)/2, η = √(xy)

(c) & = vx/y, n= cos(x + y) 

(d) é=x +y, n=y+x—1 

(e) E= x”, y=y" 

(f) ξ = sinh x, η = cosh y

(g) ξ = sin(x + y), η = cos(x − y)

(h) & = es z, n = esin y, 


2. Find the image of the region bounded by the curve cosh? x + sinh? y = 1 
under the mapping & = e7, n = e¥. 

3. Find the image of the rectangle 1 ≤ x ≤ 3, 4 ≤ y ≤ 16, under the
mapping ξ = √(x + y), η = √(y − x).

4. Is the transformation § = x — xy, n = 2xy one-to-one? 


b. Curvilinear Coordinates 


Closely connected with the first interpretation (as a mapping) of
the system of equations ξ = φ(x, y), η = ψ(x, y) is the second
interpretation as a transformation of coordinates in the plane. If the
functions φ and ψ happen not to be linear, this is no longer an “affine”
transformation but a transformation to general curvilinear coordinates.

We again assume that when (x, y) ranges over a region R of the
x, y-plane the corresponding point (ξ, η) ranges over a region B of the
ξ, η-plane and also that for each point of B the corresponding (x, y)
in R can be uniquely determined; in other words, that the
transformation is 1-1. The inverse transformation we again denote by
x = g(ξ, η), y = h(ξ, η).

By the coordinates of a point P in a region R we now mean any
number-pair that serves to specify the position of the point P in R
uniquely with respect to a given coordinate frame. Rectangular
coordinates form the simplest system of coordinates that extend over the
whole plane. Another familiar system is the system of polar coordinates
in the x, y-plane, introduced by the equations


\[ \xi = r = \sqrt{x^2 + y^2}, \qquad \eta = \theta = \arctan\frac{y}{x} \quad (0 \le \theta < 2\pi). \]


When we are given a system of functions ξ = φ(x, y), η = ψ(x, y)
as above, we can in general assign to each point P(x, y) the
corresponding values (ξ, η) as new coordinates, for each pair of values
(ξ, η) belonging to the region B uniquely determines the pair (x, y),
and, thus, uniquely determines the position of the point P in R. The
“coordinate lines” ξ = constant and η = constant are then represented
in the x, y-plane by two families of curves, which are defined implicitly
by the equations φ(x, y) = constant and ψ(x, y) = constant,
respectively. These coordinate curves cover the region R with a
coordinate net (usually curved), for which reason the coordinates (ξ, η)
are also called curvilinear coordinates in R.

We shall once again point out how closely these two interpretations
of our system of equations are interrelated. The curves in the
ξ, η-plane that in the mapping correspond to straight lines parallel
to the axes in the x, y-plane can be directly regarded as the coordinate
curves for the curvilinear coordinates x = g(ξ, η), y = h(ξ, η) in the
ξ, η-plane; conversely, the coordinate curves of the curvilinear system
ξ = φ(x, y), η = ψ(x, y) in the x, y-plane in the mapping are the images
of the straight lines parallel to the axes in the ξ, η-plane. Even in the
interpretation of (ξ, η) as curvilinear coordinates in the x, y-plane,
we must consider a ξ, η-plane and a region B of that plane in which
the point with the coordinates (ξ, η) can vary if we wish to keep the
situation clear. The difference is mainly in the point of view.¹ If we are
chiefly interested in the region R of the x, y-plane, we regard ξ, η
simply as a new means of locating points in the region R, the region
B of the ξ, η-plane being then merely subsidiary; while if we are
equally interested in the two regions R and B in the x, y-plane and the
ξ, η-plane, respectively, it is preferable to regard the system of
equations as specifying a correspondence between the two regions, that
is, a mapping of one on the other. It is, however, often desirable to
keep the two interpretations, mapping and transformation of
coordinates, in mind at the same time.


¹There is, however, a real difference, in that the equations always define a mapping,
no matter how many points (x, y) correspond to one point (ξ, η), while they define a
transformation of coordinates only when the correspondence is 1-1.

If, for example, we introduce polar coordinates (r, θ) and interpret
r and θ as rectangular coordinates in an r, θ-plane, the circles r =
constant and the lines θ = constant are mapped on straight lines
parallel to the axes in the r, θ-plane. If the region R of the x, y-plane
is the circle x² + y² ≤ 1, the point (r, θ) of the r, θ-plane will range
over a rectangle 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π, where corresponding points of
the sides θ = 0 and θ = 2π are associated with one and the same point
of R and the whole side r = 0 is the image of the origin x = 0, y = 0.

Another example of a curvilinear coordinate system is the system
of parabolic coordinates. We arrive at these by considering the family
of confocal parabolas in the x, y-plane (cf. also p. 234 and Fig. 3.9)


\[ y^2 = 2c\left(x + \frac{c}{2}\right), \]


all of which have the origin as focus and the x-axis as axis. Through
each point of the plane but the origin there pass two parabolas of the
family, one corresponding to a positive parameter value c = ξ and the
other to a negative parameter value c = η. We obtain these two values
by solving for c the quadratic equation y² = 2c(x + c/2) using the
values of x and y corresponding to the point; this gives


\[ \xi = -x + \sqrt{x^2 + y^2}, \qquad \eta = -x - \sqrt{x^2 + y^2}. \]


These quantities ξ and η may be introduced as curvilinear coordinates
in the x, y-plane, the confocal parabolas then becoming the coordinate
curves. These are indicated in Fig. 3.9 if we imagine the symbols (x, y)
and (ξ, η) interchanged.

In using parabolic coordinates (ξ, η) we must bear in mind that the
one pair of values (ξ, η) corresponds to two points (x, y) and (x, −y),
the two intersections of the corresponding parabolas. Hence, in order
to obtain a 1-1 correspondence between the pair (x, y) and the pair
(ξ, η), we must restrict ourselves to a half-plane, y ≥ 0, say. Then every
region R in this half-plane is in 1-1 correspondence with a region B
of the ξ, η-plane, and the rectangular coordinates (ξ, η) of each point in
this region B are exactly the same as the parabolic coordinates of the
corresponding point in the region R.
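
The following Python sketch (an illustration of ours, not taken from the text) computes the parabolic coordinates of a point as the two roots of the quadratic in c and inverts them on the half-plane y ≥ 0. The inverse formulas x = −(ξ + η)/2, y = √(−ξη) are our own derivation from the sum and the product of the two roots; the function names are hypothetical.

```python
import math


def parabolic_coords(x, y):
    """Roots c = xi >= 0 and c = eta <= 0 of y^2 = 2c(x + c/2)."""
    r = math.hypot(x, y)
    return -x + r, -x - r


def from_parabolic(xi, eta):
    """Inverse on the half-plane y >= 0 (derived here, not quoted).

    xi + eta = -2x and xi * eta = -y^2, hence the formulas below.
    """
    x = -(xi + eta) / 2.0
    y = math.sqrt(max(0.0, -xi * eta))  # clamp tiny negative rounding errors
    return x, y


xi, eta = parabolic_coords(1.5, 2.0)
print(xi, eta)
print(from_parabolic(xi, eta))
```

Starting from a point with y < 0 and applying both functions returns the mirror image (x, −y), which is the two-to-one behavior described above.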


Exercises 3.3b 


1. Prove that for x ≠ 1, 0 < y < π/2, ξ = (sin y)/(x − 1), η = x tan y define a
system of curvilinear coordinates.

2. Find the equation for the circle x² + y² = 1 in terms of the curvilinear
coordinates
ξ = x³ + 1, η = xy.


3. For what points of the x, y-plane can we not use ξ = xy and η = x² + y²
as curvilinear coordinates?


c. Extension to More Than Two Independent Variables 


For three or more independent variables the state of affairs is
analogous. Thus, a system of three continuously differentiable functions


\[ \xi = \phi(x, y, z), \qquad \eta = \psi(x, y, z), \qquad \zeta = \chi(x, y, z), \]


defined in a region R of x, y, z-space, may be regarded as the mapping
of the region R on a region B of ξ, η, ζ-space. If this mapping of R on
B is 1-1, so that for each image point (ξ, η, ζ) of B the coordinates
(x, y, z) of the corresponding point (original point or inverse image) in
R can be uniquely calculated by means of functions


\[ x = g(\xi, \eta, \zeta), \qquad y = h(\xi, \eta, \zeta), \qquad z = l(\xi, \eta, \zeta), \]


then (ξ, η, ζ) may also be regarded as general coordinates of the point
P in the region R. The surfaces ξ = constant, η = constant,
ζ = constant, or, in other symbols,


\[ \phi(x, y, z) = \text{constant}, \qquad \psi(x, y, z) = \text{constant}, \qquad \chi(x, y, z) = \text{constant}, \]


then form a system of three families of surfaces that cover the region
R and may be called curvilinear coordinate surfaces.

Just as for two independent variables, we can interpret 1-1
transformations in three dimensions as deformations of a substance
spread continuously throughout a region of space.

A very important system of coordinates are the spherical coordinates,
sometimes called polar coordinates in space. These specify the
position of a point P in space by three numbers: (1) the distance
r = √(x² + y² + z²) from the origin; (2) the geographical longitude φ,
that is, the angle between the x, z-plane and the plane determined by P
and the z-axis; and (3) the polar inclination or complementary latitude
θ, that is, the angle between the radius vector OP and the positive
z-axis. As we see from Fig. 3.10, the three spherical coordinates r, φ, θ
are related to the rectangular coordinates by the equations of
transformation


\[ x = r\cos\phi\,\sin\theta, \qquad y = r\sin\phi\,\sin\theta, \qquad z = r\cos\theta, \]


Figure 3.10 Spherical coordinates.


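from which we obtain the inverse relations


\[ r = \sqrt{x^2 + y^2 + z^2}, \]

\[ \phi = \arccos\frac{x}{\sqrt{x^2 + y^2}} = \arcsin\frac{y}{\sqrt{x^2 + y^2}}, \]

\[ \theta = \arccos\frac{z}{\sqrt{x^2 + y^2 + z^2}} = \arcsin\frac{\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2 + z^2}}. \]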


For polar coordinates in the plane the origin is an exceptional point
in that the 1-1 correspondence fails because the angle is indeterminate
there. In the same way, for spherical coordinates in space the
whole of the z-axis is an exception in that the longitude φ is
indeterminate there. At the origin itself the polar inclination θ is
also indeterminate.

The coordinate surfaces for three-dimensional polar coordinates
are as follows: (1) for constant values of r, the concentric spheres
about the origin; (2) for constant values of φ, the family of half-planes
through the z-axis; (3) for constant values of θ, the circular cones with
the z-axis as axis and the origin as vertex (Fig. 3.11).
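
A short Python sketch of the two conversions may help to fix the conventions (longitude measured in the x, y-plane, polar inclination measured from the positive z-axis). It is an illustration of ours: instead of the paired arc cos and arc sin expressions of the text it uses the library function math.atan2 to pick the quadrant of the longitude, and the function names are hypothetical.

```python
import math


def to_spherical(x, y, z):
    """Return (r, phi, theta) with 0 <= phi < 2*pi and 0 <= theta <= pi."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("phi and theta are indeterminate at the origin")
    phi = math.atan2(y, x) % (2.0 * math.pi)      # longitude; arbitrary on the z-axis
    theta = math.acos(max(-1.0, min(1.0, z / r)))  # polar inclination
    return r, phi, theta


def from_spherical(r, phi, theta):
    """x = r cos(phi) sin(theta), y = r sin(phi) sin(theta), z = r cos(theta)."""
    return (r * math.cos(phi) * math.sin(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(theta))


r, phi, theta = to_spherical(1.0, -2.0, 0.5)
print(r, phi, theta)
print(from_spherical(r, phi, theta))
```

On the z-axis the value returned for the longitude is merely whatever atan2 yields for (0, 0); this reflects the indeterminacy noted above rather than a meaningful angle.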

Another coordinate system that is often used is the system of
cylindrical coordinates. These are obtained by introducing polar
coordinates ρ, φ in the x, y-plane and retaining z as the third
coordinate. Then the formulae for transformation from rectangular
coordinates to cylindrical coordinates are


\[ x = \rho\cos\phi, \qquad y = \rho\sin\phi, \qquad z = z, \]


and the inverse transformation is


\[ \rho = \sqrt{x^2 + y^2}, \qquad \phi = \arccos\frac{x}{\sqrt{x^2 + y^2}} = \arcsin\frac{y}{\sqrt{x^2 + y^2}}, \qquad z = z. \]


Figure 3.11 Coordinate surfaces for spherical coordinates.


The coordinate surfaces ρ = constant are the vertical circular
cylinders that intersect the x, y-plane in concentric circles with the
origin as center; the surfaces φ = constant are the half-planes
through the z-axis, and the surfaces z = constant are the planes
parallel to the x, y-plane.


Exercises 3.3c 


1. Find the inverse of the curvilinear coordinate transformation


\[ \xi = \frac{x}{x^2 + y^2 + z^2}, \qquad \eta = \frac{y}{x^2 + y^2 + z^2}, \qquad \zeta = \frac{z}{x^2 + y^2 + z^2}. \]

2. Invert the coordinate transformation w = r cos φ, x = r sin φ cos ψ,
y = r sin φ sin ψ cos θ, z = r sin φ sin ψ sin θ. What are the sets
r = constant, φ = constant, ψ = constant, θ = constant?


d. Differentiation Formulae for the Inverse Functions 


In many cases of practical importance it is possible to solve the
given system of equations explicitly, as in the above examples, and
thus to recognize that the inverse functions are continuous and
possess continuous derivatives. If we may presume the existence and
differentiability of the inverse functions, we can calculate the
derivatives of the inverse functions without actually solving the
equations explicitly in the following way: We substitute the inverse
functions x = g(ξ, η), y = h(ξ, η) in the given equations ξ = φ(x, y),
η = ψ(x, y). On the right we obtain the compound functions
φ(g(ξ, η), h(ξ, η)) and ψ(g(ξ, η), h(ξ, η)) of ξ and η; but these must
be equal to ξ and η, respectively. We now differentiate each of the
equations


(24a)    \[ \xi = \phi(g(\xi, \eta), h(\xi, \eta)), \qquad \eta = \psi(g(\xi, \eta), h(\xi, \eta)) \]


with respect to ξ and to η, regarding ξ and η as independent variables¹
and applying the chain rule to differentiate the compound functions.
We then obtain the system of equations


(24b)    \[ 1 = \phi_x g_\xi + \phi_y h_\xi, \qquad 0 = \phi_x g_\eta + \phi_y h_\eta, \]
         \[ 0 = \psi_x g_\xi + \psi_y h_\xi, \qquad 1 = \psi_x g_\eta + \psi_y h_\eta. \]


Solving these equations, we obtain expressions for the partial
derivatives of the inverse functions x = g(ξ, η) and y = h(ξ, η) with
respect to ξ and η, expressed in terms of the derivatives of the original
functions φ(x, y) and ψ(x, y) with respect to x and y, namely,


(24c)    \[ g_\xi = \frac{\psi_y}{D}, \qquad g_\eta = -\frac{\phi_y}{D}, \qquad h_\xi = -\frac{\psi_x}{D}, \qquad h_\eta = \frac{\phi_x}{D} \]


or


(24d)    \[ x_\xi = \frac{\eta_y}{D}, \qquad x_\eta = -\frac{\xi_y}{D}, \qquad y_\xi = -\frac{\eta_x}{D}, \qquad y_\eta = \frac{\xi_x}{D}. \]

For brevity we have here written

(24e)    \[ D = \xi_x \eta_y - \xi_y \eta_x = \begin{vmatrix} \dfrac{\partial \xi}{\partial x} & \dfrac{\partial \xi}{\partial y} \\[1ex] \dfrac{\partial \eta}{\partial x} & \dfrac{\partial \eta}{\partial y} \end{vmatrix}. \]


¹These equations hold for all values of ξ and η under consideration; as we say, they
hold identically, in contrast to equations between variables that are satisfied only
for some of the values of these variables. Such identical equations or identities, when
differentiated with respect to any of the variables occurring in them, again yield
identities, as follows immediately from the definition.


This expression D, which we assume is not zero at the point in
question, is called the Jacobian or functional determinant of the
functions ξ = φ(x, y) and η = ψ(x, y) with respect to the variables x
and y. It plays a major role wherever we consider transformations, as
will become apparent in the sequel.

Above, as occasionally elsewhere, we have used the shorter notation
ξ(x, y) instead of the more detailed notation ξ = φ(x, y), which
distinguishes between the quantity ξ and its functional expression
φ(x, y). We shall often use similar abbreviations in the future when
there is no risk of confusion.

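For polar coordinates in the plane expressed in terms of rectangular
coordinates,


\[ \xi = r = \sqrt{x^2 + y^2} \quad\text{and}\quad \eta = \theta = \arctan\frac{y}{x}, \]


the partial derivatives are


\[ r_x = \frac{x}{\sqrt{x^2 + y^2}} = \frac{x}{r}, \qquad r_y = \frac{y}{\sqrt{x^2 + y^2}} = \frac{y}{r}, \]

\[ \theta_x = -\frac{y}{x^2 + y^2} = -\frac{y}{r^2}, \qquad \theta_y = \frac{x}{x^2 + y^2} = \frac{x}{r^2}. \]


Hence, the Jacobian has the value


\[ D = \frac{x}{r}\cdot\frac{x}{r^2} - \frac{y}{r}\cdot\left(-\frac{y}{r^2}\right) = \frac{1}{r}, \]


and the partial derivatives of the inverse functions (rectangular
coordinates expressed in terms of polar coordinates) are, by (24d),


\[ x_r = \frac{x}{r}, \qquad x_\theta = -y, \qquad y_r = \frac{y}{r}, \qquad y_\theta = x, \]


as we could have found more easily by direct differentiation of the
inverse formulae x = r cos θ, y = r sin θ.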
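
The polar example can be replayed numerically. The sketch below is our own illustration (the function name is hypothetical): it evaluates the partial derivatives of r and θ with respect to x and y, forms the Jacobian D, and applies the rule of (24d), namely x_ξ = η_y/D, x_η = −ξ_y/D, y_ξ = −η_x/D, y_η = ξ_x/D with ξ = r and η = θ.

```python
import math


def inverse_partials_polar(x, y):
    """Return D and (x_r, x_theta, y_r, y_theta) computed via (24d)."""
    r = math.hypot(x, y)
    r_x, r_y = x / r, y / r                  # derivatives of r = sqrt(x^2 + y^2)
    th_x, th_y = -y / r ** 2, x / r ** 2     # derivatives of theta = arctan(y/x)
    D = r_x * th_y - r_y * th_x              # Jacobian of (r, theta) w.r.t. (x, y)
    x_r, x_th = th_y / D, -r_y / D
    y_r, y_th = -th_x / D, r_x / D
    return D, (x_r, x_th, y_r, y_th)


D, partials = inverse_partials_polar(3.0, 4.0)
print("D =", D)             # close to 1/r = 0.2
print(partials)             # close to (x/r, -y, y/r, x) = (0.6, -4, 0.8, 3)
```

Direct differentiation of x = r cos θ, y = r sin θ gives cos θ, −r sin θ, sin θ, r cos θ, which are the same four numbers.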

The Jacobian occurs so frequently that a special symbol is often
used for it¹:


(25)    \[ D = \frac{d(\xi, \eta)}{d(x, y)}. \]


The appropriateness of this abbreviation will soon be obvious. From
the formulae for the derivatives of the inverse functions (24d), we find
that the Jacobian of the functions x = x(ξ, η) and y = y(ξ, η) with
respect to ξ and η is given by the expression

(26)    \[ \frac{d(x, y)}{d(\xi, \eta)} = x_\xi y_\eta - x_\eta y_\xi = \frac{\xi_x \eta_y - \xi_y \eta_x}{D^2} = \frac{1}{D} = \left(\frac{d(\xi, \eta)}{d(x, y)}\right)^{-1}. \]


That is, the Jacobian of the inverse system of functions is the reciprocal
of the Jacobian of the original system.²
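
The reciprocal relation can be tested numerically on a mapping whose inverse is known in closed form. The sketch below is our own illustration: it uses the earlier example ξ = x² − y², η = 2xy restricted to x > 0, an explicit inverse branch that we derived for this purpose, and central differences in place of exact derivatives. All function names and the step size h are assumptions of the sketch.

```python
import math


def jacobian_det(f, p, q, h=1e-6):
    """Central-difference estimate of the Jacobian determinant of f at (p, q)."""
    a1, b1 = f(p + h, q)
    a0, b0 = f(p - h, q)
    c1, d1 = f(p, q + h)
    c0, d0 = f(p, q - h)
    f1_p, f2_p = (a1 - a0) / (2 * h), (b1 - b0) / (2 * h)
    f1_q, f2_q = (c1 - c0) / (2 * h), (d1 - d0) / (2 * h)
    return f1_p * f2_q - f1_q * f2_p


def forward(x, y):
    """xi = x^2 - y^2, eta = 2xy."""
    return x * x - y * y, 2.0 * x * y


def inverse(xi, eta):
    """Inverse branch with x > 0 (our derivation): x^2 = (rho + xi)/2."""
    rho = math.hypot(xi, eta)
    x = math.sqrt((rho + xi) / 2.0)
    return x, eta / (2.0 * x)


x, y = 1.2, 0.7
xi, eta = forward(x, y)
d_forward = jacobian_det(forward, x, y)
d_inverse = jacobian_det(inverse, xi, eta)
print(d_forward, d_inverse, d_forward * d_inverse)
```

For this mapping the forward Jacobian is 4(x² + y²), so the product printed last should be close to 1.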

We can also express the second derivatives of the inverse system
of functions in terms of the first and second derivatives of the given
functions. We have only to differentiate the linear equations (24b)
with respect to ξ and to η by means of the chain rule. (We assume, of
course, that the given functions possess continuous derivatives of the
second order.) We then obtain linear equations from which the
required derivatives can readily be calculated.

For example, to calculate the derivatives


\[ \frac{\partial^2 x}{\partial \xi^2} = g_{\xi\xi} \quad\text{and}\quad \frac{\partial^2 y}{\partial \xi^2} = h_{\xi\xi}, \]


we differentiate the two equations


\[ 1 = \xi_x x_\xi + \xi_y y_\xi, \qquad 0 = \eta_x x_\xi + \eta_y y_\xi \]


once again with respect to ξ and by the chain rule obtain


(27a)    \[ 0 = \xi_{xx} x_\xi^2 + 2\xi_{xy} x_\xi y_\xi + \xi_{yy} y_\xi^2 + \xi_x x_{\xi\xi} + \xi_y y_{\xi\xi}, \]

(27b)    \[ 0 = \eta_{xx} x_\xi^2 + 2\eta_{xy} x_\xi y_\xi + \eta_{yy} y_\xi^2 + \eta_x x_{\xi\xi} + \eta_y y_{\xi\xi}. \]


¹Often the Jacobian is written with the partial derivative sign as

\[ D = \frac{\partial(\xi, \eta)}{\partial(x, y)}. \]

²This, of course, is the analogue of the rule for the derivative of the inverse of a
function of a single variable (Volume I, p. 207).


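If we solve this system of linear equations, regarding the quantities
x_ξξ and y_ξξ as unknowns (the determinant of the system is again D,
and therefore, by hypothesis, not zero) and then replace x_ξ and y_ξ by
the values already known for them, a brief calculation gives


(27c)    \[ x_{\xi\xi} = -\frac{1}{D^3} \begin{vmatrix} \xi_{xx}\eta_y^2 - 2\xi_{xy}\eta_x\eta_y + \xi_{yy}\eta_x^2 & \xi_y \\ \eta_{xx}\eta_y^2 - 2\eta_{xy}\eta_x\eta_y + \eta_{yy}\eta_x^2 & \eta_y \end{vmatrix} \]

and

(27d)    \[ y_{\xi\xi} = \frac{1}{D^3} \begin{vmatrix} \xi_{xx}\eta_y^2 - 2\xi_{xy}\eta_x\eta_y + \xi_{yy}\eta_x^2 & \xi_x \\ \eta_{xx}\eta_y^2 - 2\eta_{xy}\eta_x\eta_y + \eta_{yy}\eta_x^2 & \eta_x \end{vmatrix}. \]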


The third and higher derivatives can be obtained in the same way, 
by repeated differentiation of the linear system of equations; at each 
stage we obtain a system of linear equations with the nonvanishing 
determinant D. 
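
As a check on the procedure, the Python sketch below (an illustration of ours) evaluates the second derivative of x with respect to ξ for the mapping ξ = x² − y², η = 2xy in two ways. The first solves the differentiated linear system by Cramer's rule, with a and b denoting the quadratic expressions in the first column of the determinants; the second takes a second difference of the explicit inverse branch x > 0, which we derived for this example. Names and step size are assumptions of the sketch.

```python
import math


def x_xixi_formula(x, y):
    """Second derivative of x w.r.t. xi from the differentiated linear system."""
    xi_x, xi_y, eta_x, eta_y = 2.0 * x, -2.0 * y, 2.0 * y, 2.0 * x
    xi_xx, xi_xy, xi_yy = 2.0, 0.0, -2.0
    eta_xx, eta_xy, eta_yy = 0.0, 2.0, 0.0
    D = xi_x * eta_y - xi_y * eta_x
    a = xi_xx * eta_y ** 2 - 2.0 * xi_xy * eta_x * eta_y + xi_yy * eta_x ** 2
    b = eta_xx * eta_y ** 2 - 2.0 * eta_xy * eta_x * eta_y + eta_yy * eta_x ** 2
    return -(a * eta_y - b * xi_y) / D ** 3


def inverse_x(xi, eta):
    """x-component of the inverse branch x > 0: x^2 = (rho + xi)/2."""
    return math.sqrt((math.hypot(xi, eta) + xi) / 2.0)


def x_xixi_numeric(xi, eta, h=1e-4):
    """Second central difference of the explicit inverse in the xi-direction."""
    return (inverse_x(xi + h, eta) - 2.0 * inverse_x(xi, eta)
            + inverse_x(xi - h, eta)) / h ** 2


x, y = 1.0, 0.2
xi, eta = x * x - y * y, 2.0 * x * y
print(x_xixi_formula(x, y), x_xixi_numeric(xi, eta))
```

The two printed values should agree to several decimal places; the remaining difference is the truncation and rounding error of the finite difference.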


Exercises 3.3d 


1. Find the Jacobians of the following transformations: 
(a) ξ = ax + by, η = cx + dy
(b) r = √(x² + y²), θ = arc tan y/x
(c) =x, n=y? 
(d) ξ = ½ log (x² + y²), η = arc tan y/x
(e) 6 = xy?, n= xy 
(ff) E= x3 — y, yn=yt+ x. 


2. For each of the transformations given in Exercise 1, give the points
(x, y) lacking neighborhoods where the transformation has an inverse.


3. Find the Jacobian of the transformation ξ = f(x, y), η = g(x, y), as well
as all partial derivatives of x, y with respect to ξ, η through those of
second order, in each of the following cases:


(a) ξ = e^x cos y, η = e^x sin y

(b) ξ = x² − y², η = 2xy

(c) ξ = tan (x + y), η = cos (x − y), −π/2 < x + y < π/2

(d) ξ = sinh x + cosh y, η = −cosh x + sinh y

(e) ξ = x³ + y³, η = xy²

4. A transformation is said to be “conformal” (see p. 288) if the angle
between any two curves is preserved.
(a) Prove that the inversion


\[ \xi = \frac{x}{x^2 + y^2}, \qquad \eta = \frac{y}{x^2 + y^2} \]

is a conformal transformation;

(b) prove that the inverse of any circle is another circle or a straight
line;

(c) find the Jacobian of the inversion.


5. Let K₁, K₂, K₃ be three circles passing through O and having distinct
pairwise intersections, say P₁, P₂, P₃, at other points. Show that the
sum of the angles of the curvilinear triangle P₁P₂P₃, formed by circular
arcs, is π.

6. A transformation of the plane 


\[ u = \phi(x, y), \qquad v = \psi(x, y) \]
is conformal if the functions φ and ψ satisfy the identities


\[ \phi_x = \psi_y, \qquad \phi_y = -\psi_x. \]


7. Prove that if all the normals of a surface z = u(x, y) meet the z-axis, 
then the surface is a surface of revolution. 


8. The equation


\[ \frac{x^2}{a - t} + \frac{y^2}{b - t} = 1 \qquad (a > b) \]

determines two values of t, depending on x and y:
t₁ = λ(x, y),
t₂ = μ(x, y).


(a) Prove that the curves t₁ = constant and t₂ = constant are ellipses
and hyperbolas all having the same foci (confocal conics).


(b) Prove that the curves t₁ = constant and t₂ = constant are
orthogonal.


(c) t₁ and t₂ may be used as curvilinear coordinates (so-called focal
coordinates). Express x and y in terms of these coordinates.


(d) Express the Jacobian ∂(t₁, t₂)/∂(x, y) in terms of x and y.


(e) Find the condition that two curves represented parametrically in
the system of focal coordinates by the equations


t₁ = f₁(λ), t₂ = f₂(λ) and t₁ = g₁(μ), t₂ = g₂(μ)
are orthogonal to one another.
9. (a) Prove that the equation in t


\[ \frac{x^2}{a - t} + \frac{y^2}{b - t} + \frac{z^2}{c - t} = 1 \qquad (a > b > c) \]

has three distinct real roots t₁, t₂, t₃, which lie respectively in the
intervals

\[ -\infty < t < c, \qquad c < t < b, \qquad b < t < a, \]


provided that the point (x, y, z) does not lie on a coordinate plane.
(b) Prove that the three surfaces t₁ = constant, t₂ = constant,
t₃ = constant passing through an arbitrary point are orthogonal to one
another.
(c) Express x, y, z in terms of the focal coordinates t₁, t₂, t₃.


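10. Prove that the transformation of the x, y-plane given by the equations


\[ \xi = \frac{1}{2}\left(x + \frac{x}{x^2 + y^2}\right), \qquad \eta = \frac{1}{2}\left(y - \frac{y}{x^2 + y^2}\right) \]

(a) is conformal;


(b) transforms straight lines through the origin and circles with the
origin as center in the x, y-plane into confocal conics t = constant
given by

\[ \frac{\xi^2}{t + 1/2} + \frac{\eta^2}{t - 1/2} = 1. \]

11. For ξ = f(x, y), η = g(x, y), and D = ∂(ξ, η)/∂(x, y) ≠ 0, demonstrate the
identities

(a) \[ \frac{\partial D}{\partial y} = \frac{\partial(\xi_y, \eta)}{\partial(x, y)} + \frac{\partial(\xi, \eta_y)}{\partial(x, y)}, \]

(b) \[ D^{-3}\left[\xi_x(\eta_{yy} D - \eta_y D_y) - \xi_y(\eta_{xy} D - \eta_y D_x)\right] = D^{-3}\left[\eta_x(\xi_{yy} D - \xi_y D_y) - \eta_y(\xi_{xy} D - \xi_y D_x)\right]. \]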


e. Symbolic Product of Mappings 


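We begin with some remarks on the composition of transformations.
If the transformation


(28a)    \[ \xi = \phi(x, y), \qquad \eta = \psi(x, y) \]


gives a 1-1 mapping of the points (x, y) of a region R on points (ξ, η) of
the region B in the ξ, η-plane and if the equations


(28b)    \[ u = \Phi(\xi, \eta), \qquad v = \Psi(\xi, \eta) \]


give a 1-1 mapping of the region B on a region R′ in the u, v-plane,
then a 1-1 mapping of R on R′ is generated. This mapping we naturally
call the resultant mapping or transformation and say that it is obtained
by composition of the two given mappings and that it represents their
symbolic product. The resultant transformation is given by the
equations


\[ u = \Phi(\phi(x, y), \psi(x, y)), \qquad v = \Psi(\phi(x, y), \psi(x, y)); \]


from the definition, it follows at once that this mapping is 1-1.

By the rules for differentiating compound functions, we obtain


(29a)    \[ \frac{\partial u}{\partial x} = \Phi_\xi \phi_x + \Phi_\eta \psi_x, \qquad \frac{\partial u}{\partial y} = \Phi_\xi \phi_y + \Phi_\eta \psi_y, \]

(29b)    \[ \frac{\partial v}{\partial x} = \Psi_\xi \phi_x + \Psi_\eta \psi_x, \qquad \frac{\partial v}{\partial y} = \Psi_\xi \phi_y + \Psi_\eta \psi_y. \]


In matrix notation (p. 152)


(30)    \[ \begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[1ex] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix} = \begin{pmatrix} \Phi_\xi & \Phi_\eta \\ \Psi_\xi & \Psi_\eta \end{pmatrix} \begin{pmatrix} \phi_x & \phi_y \\ \psi_x & \psi_y \end{pmatrix}. \]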


On comparing this with the law for the multiplication of determinants
(cf. p. 172) we find¹ that the Jacobian of u and v with respect to x and
y is


(31a)    \[ \frac{\partial u}{\partial x}\frac{\partial v}{\partial y} - \frac{\partial u}{\partial y}\frac{\partial v}{\partial x} = (\Phi_\xi \Psi_\eta - \Phi_\eta \Psi_\xi)(\phi_x \psi_y - \phi_y \psi_x). \]

In words, the Jacobian of the symbolic product of two transformations
is equal to the product of the Jacobians of the individual transformations,
namely, in the notation (25),


\[ \frac{d(u, v)}{d(x, y)} = \frac{d(u, v)}{d(\xi, \eta)} \cdot \frac{d(\xi, \eta)}{d(x, y)}. \]


This equation brings out the appropriateness of our symbol for the
Jacobians. When transformations are combined, the Jacobians behave
in the same way as the derivatives behave when functions of one variable
are combined. The Jacobian of the resultant transformation differs
from zero, provided the same is true for the individual (or component)
transformations.
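
The product rule for Jacobians can be illustrated numerically. In the Python sketch below (our own example, with hypothetical function names) the first mapping is ξ = x² − y², η = 2xy and the second is u = e^ξ cos η, v = e^ξ sin η. The matrix of derivatives of the resultant mapping is formed by multiplying the two 2 × 2 derivative matrices, and its determinant is compared with the product of the two Jacobians.

```python
import math


def det2(m):
    (a, b), (c, d) = m
    return a * d - b * c


def matmul2(p, q):
    """Product of two 2 x 2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def first(x, y):                 # xi = x^2 - y^2, eta = 2xy
    return x * x - y * y, 2.0 * x * y


def second(xi, eta):             # u = e^xi cos(eta), v = e^xi sin(eta)
    return math.exp(xi) * math.cos(eta), math.exp(xi) * math.sin(eta)


def jac_first(x, y):
    return ((2.0 * x, -2.0 * y), (2.0 * y, 2.0 * x))


def jac_second(xi, eta):
    e = math.exp(xi)
    return ((e * math.cos(eta), -e * math.sin(eta)),
            (e * math.sin(eta), e * math.cos(eta)))


def jac_composite(x, y):
    """Chain rule: matrix of the second mapping times matrix of the first."""
    return matmul2(jac_second(*first(x, y)), jac_first(x, y))


def du_dx_numeric(x, y, h=1e-6):
    """Central-difference check of the entry du/dx of the composite matrix."""
    return (second(*first(x + h, y))[0] - second(*first(x - h, y))[0]) / (2 * h)


x, y = 0.4, 0.3
J = jac_composite(x, y)
print(det2(J), det2(jac_second(*first(x, y))) * det2(jac_first(x, y)))
print(J[0][0], du_dx_numeric(x, y))
```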

If, in particular, the second transformation


\[ u = \Phi(\xi, \eta), \qquad v = \Psi(\xi, \eta) \]


is the inverse of the first,


(31b)    \[ \xi = \phi(x, y), \qquad \eta = \psi(x, y), \]


and if both transformations are differentiable, the resultant
transformation will simply be the identical transformation; that is,
u = x, v = y. The Jacobian of this last transformation is obviously 1,
so that we again obtain the relation (26).

From this, incidentally, it follows that neither of the two Jacobians
can vanish:


\[ \frac{d(\xi, \eta)}{d(x, y)} \cdot \frac{d(x, y)}{d(\xi, \eta)} = 1. \]


¹The same result can, of course, be obtained by straightforward multiplication.

For a pair of continuously differentiable functions φ(x, y) and ψ(x, y)
that has a nonvanishing Jacobian, we can find formulae for the
corresponding mapping of directions at a point (x₀, y₀) = P₀. A curve
passing through P₀ can be described parametrically by equations
x = f(t), y = g(t), where f(t₀) = x₀, g(t₀) = y₀. The slope of the curve
at P₀ is given by


\[ m = \frac{g'(t_0)}{f'(t_0)}. \]


Similarly, the slope of the image curve


\[ \xi = \phi(f(t), g(t)), \qquad \eta = \psi(f(t), g(t)) \]


at the point corresponding to P₀ is


(32)    \[ \mu = \frac{d\eta/dt}{d\xi/dt} = \frac{\psi_x f' + \psi_y g'}{\phi_x f' + \phi_y g'} = \frac{c + dm}{a + bm}, \]


where a, b, c, d are the constants
a = φ_x(x₀, y₀), b = φ_y(x₀, y₀), c = ψ_x(x₀, y₀), d = ψ_y(x₀, y₀).


The relation (32) between the slope m of the original curve at P₀ and
the slope μ of the image curve is the same as for the affine mapping


\[ \xi = \phi(x_0, y_0) + a(x - x_0) + b(y - y_0), \]
\[ \eta = \psi(x_0, y_0) + c(x - x_0) + d(y - y_0) \]
that approximates our mapping near P₀. Since


\[ \frac{d\mu}{dm} = \frac{ad - bc}{(a + bm)^2}, \]

we find that μ is an increasing function of m for ad − bc > 0 and a
decreasing function for ad − bc < 0.¹

Increasing slopes correspond to increasing angles of inclination
or to counterclockwise rotation of the corresponding directions. Thus,
dμ/dm > 0 implies that the counterclockwise sense of rotation is
preserved, while it is reversed for dμ/dm < 0. Now, ad − bc is just the
Jacobian


\[ \frac{d(\xi, \eta)}{d(x, y)} = \begin{vmatrix} \phi_x & \phi_y \\ \psi_x & \psi_y \end{vmatrix} \]


evaluated at the point P₀. It follows that the mapping ξ = φ(x, y),
η = ψ(x, y) preserves or reverses orientations near the point (x₀, y₀)
according to whether the Jacobian at that point is positive or negative.
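
A small Python sketch (ours; the function names are hypothetical) makes the role of the sign of ad − bc concrete. It evaluates the slope of the image direction, (c + dm)/(a + bm), and its rate of change with respect to m for two sets of constants: those of the mapping ξ = x² − y², η = 2xy at the point (1, 0.5), where the Jacobian is positive, and those of the reflection ξ = x, η = −y, where it is negative.

```python
def image_slope(a, b, c, d, m):
    """Slope of the image direction: (c + d*m) / (a + b*m)."""
    return (c + d * m) / (a + b * m)


def slope_rate(a, b, c, d, m):
    """Derivative of the image slope with respect to m: (ad - bc)/(a + bm)^2."""
    return (a * d - b * c) / (a + b * m) ** 2


# xi = x^2 - y^2, eta = 2xy at (1, 0.5): a = 2, b = -1, c = 1, d = 2, ad - bc = 5.
print(image_slope(2.0, -1.0, 1.0, 2.0, 0.0), image_slope(2.0, -1.0, 1.0, 2.0, 1.0))
print(slope_rate(2.0, -1.0, 1.0, 2.0, 0.5))    # positive: orientation preserved

# Reflection xi = x, eta = -y: a = 1, b = 0, c = 0, d = -1, ad - bc = -1.
print(image_slope(1.0, 0.0, 0.0, -1.0, 0.0), image_slope(1.0, 0.0, 0.0, -1.0, 1.0))
print(slope_rate(1.0, 0.0, 0.0, -1.0, 0.5))    # negative: orientation reversed
```

Both functions fail with a division by zero when a + bm = 0, that is, for the direction whose image is vertical; this is the local restriction mentioned in the footnote to the preceding argument.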


Exercises 3.3e 


1. For each of the following pairs of transformations find ∂(u, v)/∂(x, y)
first by eliminating ξ and η, then by applying (31b):


1 
(a) f= gle en {5 =e cosy 
v = arc tan 7 n= e* sin y 
u =ë% -n E =x cos y 
(b) pE \p=xsm> 
u=e& cosy, B= x| (x? + y?) 
(c) lp met sin 7 (Fie ty 


2. In which of the following successive transformations can x, y be defined
as continuously differentiable functions of u, v in a neighborhood of the
indicated point (u₀, v₀)?

(a) ξ = e^x cos y, η = e^x sin y;

u = ξ² − η², v = 2ξη; u₀ = 1, v₀ = 0;
(b) & = cosh x + sinh y, 7 = sinh x + cosh y, 

u= e, v=e", uo= Vo= 1; 
(c) & = x3 — y3, n= x? + 2xy?; 

u= E5 + n, V = 75 — &; uo = 1, vo = 0. 

3. Consider the transformation

\[ u = \phi(\xi, \eta), \quad v = \psi(\xi, \eta); \qquad \xi = f(x), \quad \eta = g(y). \]

Show that

\[ \frac{\partial(u, v)}{\partial(x, y)} = f'(x)\, g'(y)\, \frac{\partial(u, v)}{\partial(\xi, \eta)}. \]


¹More precisely, this holds locally, excluding the directions where m or μ become
infinite.


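4. If z = f(x, y) and ξ = φ(x, y), η = ψ(x, y), show that

\[ \frac{\partial z}{\partial \xi} = \frac{\partial(z, \eta)}{\partial(x, y)} \bigg/ \frac{\partial(\xi, \eta)}{\partial(x, y)} \]

and

\[ \frac{\partial z}{\partial \eta} = \frac{\partial(\xi, z)}{\partial(x, y)} \bigg/ \frac{\partial(\xi, \eta)}{\partial(x, y)}, \]

provided ∂(ξ, η)/∂(x, y) ≠ 0.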


f. General Theorem on the Inversion of Transformations and of 
Systems of Implicit Functions. Decomposition into Primitive 
Mappings 


The possibility of inverting a transformation depends on the
following general theorem:

Let φ(x, y) and ψ(x, y) be continuously differentiable functions in a
neighborhood of a point (x₀, y₀), for which the Jacobian
D = φ_x ψ_y − φ_y ψ_x is not zero at (x₀, y₀). Put u₀ = φ(x₀, y₀),
v₀ = ψ(x₀, y₀). Then there exist neighborhoods N of (x₀, y₀) and N′ of
(u₀, v₀) such that the mapping

(33a)    \[ u = \phi(x, y), \qquad v = \psi(x, y) \]

has a unique inverse

(33b)    \[ x = g(u, v), \qquad y = h(u, v) \]

mapping N′ into N. The functions g and h satisfy the identities


(33c)    \[ u = \phi(g(u, v), h(u, v)), \qquad v = \psi(g(u, v), h(u, v)) \]

for (u, v) in N′, and the equations

(33d)    \[ x_0 = g(u_0, v_0), \qquad y_0 = h(u_0, v_0). \]


The inverse functions g, h have continuous derivatives for (u, v) near
(u₀, v₀), given by


(33e)    \[ \frac{\partial x}{\partial u} = \frac{1}{D}\frac{\partial v}{\partial y}, \qquad \frac{\partial x}{\partial v} = -\frac{1}{D}\frac{\partial u}{\partial y}, \]

(33f)    \[ \frac{\partial y}{\partial u} = -\frac{1}{D}\frac{\partial v}{\partial x}, \qquad \frac{\partial y}{\partial v} = \frac{1}{D}\frac{\partial u}{\partial x}. \]

The proof follows from the implicit function theorem on p. 228,
which permits one to solve an equation for a single variable. In
essence, we invert equations (33a) by solving the first equation for one
of the variables x, y and substituting the resulting expression into the
second equation, obtaining an equation for the second variable alone.

Since by assumption the Jacobian D does not vanish at the point
(x₀, y₀), at least one of the first derivatives of φ(x, y) differs from zero
at that point. Let, say, φ_x(x₀, y₀) ≠ 0. We can then solve the equation


(34a)    \[ u = \phi(x, y) \]


for x. More precisely, we can find positive constants h₁, h₂, h₃ such that
for


(34b)    \[ |u - u_0| < h_1, \qquad |y - y_0| < h_2 \]


equation (34a) has a unique solution x = X(u, y) for which
|x − x₀| < h₃. The function X(u, y) has the domain (34b) and satisfies
the equations


(34c)    \[ \phi(X(u, y), y) = u, \qquad X(u_0, y_0) = x_0, \]

and the inequality

(34d)    \[ |X(u, y) - x_0| < h_3. \]

Moreover, X(u, y) has continuous derivatives, for which, by (34c),

(34e)    \[ \phi_x(X(u, y), y)\, X_u(u, y) = 1, \]

(34f)    \[ \phi_x(X(u, y), y)\, X_y(u, y) + \phi_y(X(u, y), y) = 0. \]
We assume here that h₂, h₃ are so small that the rectangle

(34g)    \[ |x - x_0| < h_3, \qquad |y - y_0| < h_2 \]


lies in the domain of φ(x, y), ψ(x, y). Substituting the expression
X(u, y) for x into the function ψ(x, y), we obtain a compound function


(34h)    \[ \psi(X(u, y), y) = \chi(u, y) \]

with domain (34b). Here, by (34c, f),


(34i)    \[ \chi(u_0, y_0) = \psi(x_0, y_0) = v_0, \]

(34j)    \[ \chi_y(u_0, y_0) = \psi_x X_y + \psi_y = -\psi_x \frac{\phi_y}{\phi_x} + \psi_y = \frac{D}{\phi_x} \neq 0; \]


we have φ_x ≠ 0 from (34e). It follows that we can find positive
constants h₄, h₅, h₆ such that for


(34k)    \[ |u - u_0| < h_4, \qquad |v - v_0| < h_5 \]

the equation

(34m)    \[ \chi(u, y) = v \]


has a unique solution y = h(u, v), for which |y − y₀| < h₆. We can
assume here that h₄ ≤ h₁, h₆ ≤ h₂ (see footnote on p. 228).
Finally, we set


(34n)    \[ X(u, h(u, v)) = g(u, v). \]


The two functions g(u, v), h(u, v) have the domain (34k). By (34c, h)
they satisfy the equations


\[ \phi(g(u, v), h(u, v)) = \phi(X(u, h(u, v)), h(u, v)) = u, \]
\[ \psi(g(u, v), h(u, v)) = \psi(X(u, h(u, v)), h(u, v)) = \chi(u, h(u, v)) = v \]

and the inequalities

\[ |g(u, v) - x_0| < h_3, \qquad |h(u, v) - y_0| < h_6. \]

Formulae (33e, f) for the derivatives of g and h were derived earlier,
on p. 253.
To show the uniqueness of the inverse functions, assume that x,
y, u, v is any set of values that satisfy the equations (33a) and the
inequalities


\[ |x - x_0| < h_3, \qquad |y - y_0| < h_6, \qquad |u - u_0| < h_4, \qquad |v - v_0| < h_5. \]

Since (34a, b) hold, we conclude that

(34o)    \[ x = X(u, y). \]

From (34h) we obtain the equation


\[ v = \psi(x, y) = \psi(X(u, y), y) = \chi(u, y), \]

which has the unique solution y = h(u, v). The relation x = g(u, v)
then follows from (34n, o). The relations (33d) for g and h follow from
the uniqueness of the solution and the assumption that u₀ = φ(x₀, y₀),
v₀ = ψ(x₀, y₀).

We have assumed so far that φ_x(x₀, y₀) ≠ 0. If φ_x(x₀, y₀) = 0, but
φ_y(x₀, y₀) ≠ 0, the inversion of the mapping (33a) proceeds similarly.
In this case we solve the first equation of (33a) for y and substitute the
resulting function y = Y(u, x) into the second equation, obtaining an
equation for x alone.
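
The two-step construction used in the proof can be imitated numerically. The Python sketch below is an illustration of ours under stated assumptions: the mapping is u = e^x cos y, v = e^x sin y near the point (0, 0.5), the one-variable equations are solved by bisection rather than by appeal to the implicit function theorem, and the half-widths H3 and H2 of the search intervals are simply chosen by hand so that each one-variable function is increasing and changes sign on its interval.

```python
import math


def bisect(f, lo, hi, iters=80):
    """Root of an increasing function f with f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phi(x, y):
    return math.exp(x) * math.cos(y)


def psi(x, y):
    return math.exp(x) * math.sin(y)


X0, Y0 = 0.0, 0.5      # base point (x0, y0); phi_x > 0 there
H3, H2 = 1.0, 0.3      # hand-chosen half-widths for x and y (assumptions)


def X(u, y):
    """Step 1: solve phi(x, y) = u for x with y held fixed."""
    return bisect(lambda x: phi(x, y) - u, X0 - H3, X0 + H3)


def chi(u, y):
    """Compound function chi(u, y) = psi(X(u, y), y)."""
    return psi(X(u, y), y)


def invert(u, v):
    """Step 2: solve chi(u, y) = v for y, then recover x = X(u, y)."""
    y = bisect(lambda t: chi(u, t) - v, Y0 - H2, Y0 + H2)
    return X(u, y), y


u, v = phi(0.1, 0.6), psi(0.1, 0.6)
print(invert(u, v))    # close to (0.1, 0.6)
```

The sketch only works for (u, v) close enough to the image of the base point that both sign changes occur inside the chosen intervals, which mirrors the local character of the theorem.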

The inversion of the plane mapping (33a) has been reduced to
inversions of mappings in which only one variable is transformed at a
time. Generally, we call the transformation (33a) primitive, if it leaves
one of the coordinates unchanged, that is, if either the function φ(x, y)
is identical with x or the function ψ(x, y) is identical with y. The effect
of a primitive transformation of the type u = φ(x, y), v = y is to move
each point in the direction of the x-axis, keeping its ordinate
unchanged. After deformation the point has a new abscissa, which
depends on both x and y. If the Jacobian φ_x of the primitive mapping is
positive, u varies monotonically with x for fixed y.

We shall prove that we can decompose an arbitrary transformation
(33a) with nonvanishing Jacobian into primitive transformations in a
neighborhood of a point. This follows readily from our construction of
the inverse mapping. If φ_x(x₀, y₀) ≠ 0, we represent the mapping (33a)
as the symbolic product of the primitive mappings


(34p)    \[ \xi = \phi(x, y), \qquad \eta = y \]

and

(34q)    \[ u = \xi, \qquad v = \chi(\xi, \eta). \]

Here the domain R of the first mapping in the x, y-plane shall be a
rectangle so small that


\[ |x - x_0| < h_3, \qquad |y - y_0| < h_2, \qquad |\phi(x, y) - u_0| < h_1, \]

while the second mapping has the domain

\[ |\xi - u_0| < h_1, \qquad |\eta - y_0| < h_2. \]


It follows that the image (ξ, η) of a point (x, y) of R in the mapping
(34p) lies in the domain of the mapping (34q) and that


\[ x = X(\xi, y). \]

Consequently, also


(34r)    \[ x = X(\phi(x, y), y). \]

For the mapping compounded from (34p, q) we then have by (34h, r)

\[ u = \phi(x, y), \]
\[ v = \chi(\phi(x, y), y) = \psi(X(\phi(x, y), y), y) = \psi(x, y). \]


An analogous decomposition of the mapping (33a) is obtained when
φ_x(x₀, y₀) = 0 but φ_y(x₀, y₀) ≠ 0. We only have to interchange the
roles of the variables x and y.
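
For a concrete mapping the decomposition can be written out explicitly. The Python sketch below is our own example: for u = e^x cos y, v = e^x sin y with cos y > 0 the first equation can be solved in closed form, x = log(u / cos y), so the two primitive mappings (one changing only the abscissa, the other only the ordinate) can be composed and compared with the original mapping. The function names are hypothetical.

```python
import math


def phi(x, y):
    return math.exp(x) * math.cos(y)


def psi(x, y):
    return math.exp(x) * math.sin(y)


def X(u, y):
    """Explicit solution of phi(x, y) = u for x (needs u > 0 and cos y > 0)."""
    return math.log(u / math.cos(y))


def first_primitive(x, y):
    """xi = phi(x, y), eta = y: only the first coordinate changes."""
    return phi(x, y), y


def second_primitive(xi, eta):
    """u = xi, v = chi(xi, eta) with chi(xi, eta) = psi(X(xi, eta), eta)."""
    return xi, psi(X(xi, eta), eta)


def composite(x, y):
    return second_primitive(*first_primitive(x, y))


print(composite(0.3, 0.4))
print(phi(0.3, 0.4), psi(0.3, 0.4))
```

The two printed pairs should agree up to rounding, in line with the identity v = ψ(X(φ(x, y), y), y) = ψ(x, y).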

We cannot expect to resolve a transformation into primitive trans- 
formations in one and the same manner throughout the whole open 
region R. However, since some type of decomposition can be carried
out near each point of R, every bounded closed subset of R can be
subdivided into a finite number of sets¹ such that in each one of
those sets one of the decompositions is possible.

The inversion theorem is a special case of a more general theorem 
that may be regarded as an extension of the theorem of implicit func- 
tions to systems of functions. The theorem of implicit functions (p. 
228) applies to the solution of one equation for one of the variables. 
The general theorem is as follows: 


If $\phi(x, y, u, v, \ldots, w)$ and $\psi(x, y, u, v, \ldots, w)$ are
continuously differentiable functions of x, y, u, v, . . ., w, and the
equations

$\phi(x, y, u, v, \ldots, w) = 0$ and $\psi(x, y, u, v, \ldots, w) = 0$

are satisfied by a certain set of values $x_0, y_0, u_0, v_0, \ldots, w_0$
and if in addition the Jacobian of $\phi$ and $\psi$ with respect to x
and y differs from zero at that point (that is,
$D = \phi_x \psi_y - \phi_y \psi_x \neq 0$), then in the neighborhood
of that point the equations $\phi = 0$ and $\psi = 0$ can be solved in
one, and only one way for x and y, and this solution gives x and y as
continuously differentiable functions of u, v, . . ., w.

The proof of this theorem is similar to that of the inversion theorem
above. From the assumption $D \neq 0$ we can conclude that at the point
in question some partial derivative does not vanish, say
$\phi_x \neq 0$. By the main theorem of p. 228, if we restrict
x, y, u, v, . . ., w to sufficiently small intervals about
$x_0, y_0, u_0, v_0, \ldots, w_0$, respectively, the equation
$\phi(x, y, u, v, \ldots, w) = 0$ can be solved in exactly one way for
x as a function of the other variables, and this solution
$x = X(y, u, v, \ldots, w)$ is a continuously differentiable function
of its arguments and has the partial derivative
$X_y = -\phi_y/\phi_x$. If we substitute this function
$x = X(y, u, v, \ldots, w)$ in $\psi(x, y, u, v, \ldots, w)$, we obtain
a function
$\psi(X(y, u, v, \ldots, w), y, u, v, \ldots, w) = \chi(y, u, v, \ldots, w)$,
and

$\chi_y = \psi_x X_y + \psi_y = -\psi_x \frac{\phi_y}{\phi_x} + \psi_y = \frac{D}{\phi_x}.$

Hence, in virtue of the assumption that $D \neq 0$, we see that the
derivative $\chi_y$ is not zero. Thus, if we restrict y, u, v, . . ., w
to intervals about $y_0, u_0, v_0, \ldots, w_0$ contained in the
intervals to which they were previously restricted, we can solve the
equation $\chi = 0$ in exactly one way for y as a function of
u, v, . . ., w, and this solution is continuously differentiable.
Substituting this expression for y in the equation
$x = X(y, u, v, \ldots, w)$, we find x as a function of u, v, . . ., w.
This solution is unique and continuously differentiable, subject to
the restriction of x, y, u, v, . . ., w to sufficiently small intervals
about $x_0, y_0, u_0, v_0, \ldots, w_0$, respectively.

¹This follows from the covering theorem, p. 109.


Exercises 3.3f 


1. Which of the following systems of equations may be solved for x, y as 
continuously differentiable functions of the remaining variables near 
the indicated points? 

(a) $e^x \sin u - e^y \cos v + w = 0$
    $x \cosh w - u \sinh y - v^2 = \cosh 1$
    x = 1, y = 0, u = 0, v = 0, w = 1
(b) $u \cos x - v \sin y + w^2 = 1$
    $\cos(x + y) + uv = 1$,
    x = 0, y = π/2, u = 1, v = 1, w = 1
(c) x2 + y? ++ u?—v=0 
x? — y? + 2u—1=0 
x=yr=u=v=1 
(d) $\cos x + t \sin y = 0$
    $\sin x - \cos ty = 0$,
    x = π, y = π/2, t = 1.


g. Alternate Construction of the Inverse Mapping by the Method 
of Successive Approximations 


In the preceding proof the problem of inverting a mapping was
reduced to the one-dimensional case and ultimately to the elementary
fact that the mappings furnished by continuous monotone functions
of a single variable can be inverted. This line of argument has two
undesirable features. We are forced to distinguish different cases
leading to quite different resolutions (say, for $\phi_x \neq 0$ and
$\phi_x = 0$), which do not
correspond to any radical change in the character of the original
transformation. Moreover, the existence proof is not constructive; 
it does not furnish a practical numerical scheme for inverting map- 
pings. Both of these objectionable features are absent in the method 
of iteration or of successive approximation that follows the pattern of 
the numerical methods given in Volume I (p. 502) for the solution of 
equations for a single unknown quantity. The basic idea is to apply 
successive corrections to an approximate solution, where the cor- 
rections are determined from the linear equations best approximating 
the functional relation in a neighborhood of a point. 
We again consider the equations 


(35a)    $u = \phi(x, y), \qquad v = \psi(x, y),$

where $\phi$ and $\psi$ are continuously differentiable functions in
an open set R of the x, y-plane. Let $(x_0, y_0)$ be a point of R at
which the Jacobian

(35b)    $\begin{vmatrix} \phi_x & \phi_y \\ \psi_x & \psi_y \end{vmatrix}$

has a value different from zero, and let $(u_0, v_0)$ be the image of
$(x_0, y_0)$ in the mapping (35a). We want to show that for (u, v)
sufficiently close to $(u_0, v_0)$ there exists a uniquely determined
value (x, y) near $(x_0, y_0)$ for which $u = \phi(x, y)$ and
$v = \psi(x, y)$.

To obtain the solution we shall use an iteration scheme identical 
with that for functions of one variable discussed in Volume I (p. 502) 
in a notation appropriate to the two-dimensional case. We introduce 
the vectors U = (u, v), X = (x, y). We can write the mapping (35a) 
concisely in the form 


(35c) U = F(X), 

where F is the nonlinear transformation mapping the vector with
components x, y onto the vector with components $\phi(x, y)$,
$\psi(x, y)$. The differentials dx, dy and du, dv satisfy the linear
relations (see p. 49)

(35d)    $du = d\phi = \phi_x\,dx + \phi_y\,dy$

(35e)    $dv = d\psi = \psi_x\,dx + \psi_y\,dy.$

If we combine the differentials into vectors dX = (dx, dy),
dU = (du, dv), we can write¹ the relations (35d, e) as

(35f)    $dU = F'\,dX,$

where F′ is the square matrix formed from the first derivatives of the
mapping functions

(35g)    $F' = \begin{pmatrix} \phi_x & \phi_y \\ \psi_x & \psi_y \end{pmatrix}.$

Obviously the matrix F′ plays the role of the derivative of the vector
mapping function F. The determinant of F′ is just the Jacobian (35b)
of the mapping.² Generally we shall write F′ = F′(X) to emphasize the
dependence of the matrix F′ on the vector X = (x, y). For a linear
mapping the matrix F′ is constant.

The “size” of the elements of the matrix F′ limits how much the
mapping F can magnify distances. Take two points (x, y) and
(x + h, y + k) such that the whole straight line segment joining them
lies in the domain of the mapping. By the mean value theorem for
functions of several variables (p. 67),

(36)    $\phi(x + h, y + k) - \phi(x, y) = \phi_x h + \phi_y k,$
        $\psi(x + h, y + k) - \psi(x, y) = \psi_x h + \psi_y k,$

where the values of the first derivatives are taken at suitable
points³ of the segment joining (x, y) and (x + h, y + k). Let M denote
an upper bound for the quantities

$|\phi_x|, \quad |\phi_y|, \quad |\psi_x|, \quad |\psi_y|$

taken at all points of the segment joining (x, y) and (x + h, y + k).
Then, obviously, the distance of the image points can be estimated by

(36a)    $\sqrt{(\phi(x + h, y + k) - \phi(x, y))^2 + (\psi(x + h, y + k) - \psi(x, y))^2}$
         $\leq \sqrt{(M|h| + M|k|)^2 + (M|h| + M|k|)^2}$
         $= \sqrt{2}\,M(|h| + |k|) \leq 2M\sqrt{h^2 + k^2}.$

Thus, the distance of the image points is at most 2M times that of the
original ones. Introducing the vector Y = (x + h, y + k) we can write
(36a) in the form of a Lipschitz condition for the mapping F:

(36b)    $|F(Y) - F(X)| \leq 2M\,|Y - X|,$

where M is an upper bound for the absolute values of the elements of
the matrix F′.⁴ In matrix notation equations (36) become

(36c)    $F(Y) - F(X) = H(X, Y)\,(Y - X),$

where the matrix H satisfies

(36d)    $\lim_{Y \to X} H(X, Y) = F'(X).$

¹It is best to interpret (35f) as a relation between three matrices
dU, F′, dX, identifying dX and dU with matrices with two rows and a
single column:

$dX = \begin{pmatrix} dx \\ dy \end{pmatrix}, \qquad dU = \begin{pmatrix} du \\ dv \end{pmatrix};$

see p. 153.

²The matrix F′ is often called the Jacobian matrix or the Fréchet
derivative of the mapping.

³Generally a different intermediate point has to be used in the first
and in the second equation.
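The Lipschitz estimate (36b) can be checked numerically on a sample mapping. The following is a rough sketch under assumptions of our own: the mapping $F(x, y) = (x^2 - y^2, 2xy)$ on the unit square and the helper names are hypothetical, and the bound M = 2 comes from $|2x| \leq 2$, $|2y| \leq 2$ there.

```python
import math

def F(x, y):
    """Sample mapping (an assumption): u = x^2 - y^2, v = 2xy."""
    return (x * x - y * y, 2.0 * x * y)

# On the unit square every element of F' (namely 2x, -2y, 2y, 2x)
# is at most 2 in absolute value.
M = 2.0

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def worst_ratio(n=6):
    """Largest observed |F(Y) - F(X)| / |Y - X| over an (n+1) x (n+1) grid."""
    pts = [(i / n, j / n) for i in range(n + 1) for j in range(n + 1)]
    worst = 0.0
    for p in pts:
        for q in pts:
            if p != q:
                worst = max(worst, dist(F(*p), F(*q)) / dist(p, q))
    return worst
```

The value returned by `worst_ratio()` should not exceed `2 * M`; a grid check of this kind is evidence for the bound, not a proof of it.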


⁴For mappings F in n dimensions the factor 2 in (36b) is to be
replaced by n.

We now consider the mapping U = F(X) in a neighborhood

(37a)    $|X - X_0| < \delta$

of the point $X_0 = (x_0, y_0)$ in the domain R of F. Let
$U_0 = F(X_0) = (u_0, v_0)$. For a fixed U we write the equation
U = F(X), which is to be solved for X, in the form

(37b)    $X = G(X),$

where

(37c)    $G(X) = X + a(U - F(X));$

here a stands for an appropriately chosen constant nonsingular matrix,
which has a reciprocal $a^{-1}$. Equation (37b) is then equivalent to
$a(U - F(X)) = 0$, which by multiplication with $a^{-1}$ yields

$a^{-1}a(U - F(X)) = e(U - F(X)) = U - F(X) = 0,$

where e is the unit matrix. Thus, any solution X of (37b), that is,
any fixed point of the mapping G, furnishes a solution of U = F(X).

We will show that a solution X of (37b) is given by the limit of the
$X_n$ defined by the recursion formula

(37d)    $X_{n+1} = G(X_n) \qquad (n = 0, 1, 2, \ldots),$

provided the matrix G′(X) representing the derivative of the vector
mapping G is of sufficiently small size. More precisely, we require
that for all X in the neighborhood (37a) of $X_0$ the largest element
of the matrix G′ is less than 1/4 in absolute value and that

$|G(X_0) - X_0| < \tfrac{1}{2}\delta.$


First we prove by induction that under the stated assumptions
the recursion formula (37d) leads only to vectors satisfying (37a).
In this way, one is sure that the $X_n$ lie in the domain of G, so
that the sequence can be continued indefinitely. We find from (36b)
with M = 1/4 that

(37e)    $|G(Y) - G(X)| \leq \tfrac{1}{2}|Y - X|$  for  $|X - X_0| < \delta$, $|Y - X_0| < \delta$.

Now the inequality (37a) is satisfied trivially for $X = X_0$. If it
holds for $X = X_n$, we find for the vector $X_{n+1}$ defined by (37d)
that

$|X_{n+1} - X_0| \leq |X_{n+1} - X_1| + |X_1 - X_0| = |G(X_n) - G(X_0)| + |G(X_0) - X_0|$
$\leq \tfrac{1}{2}|X_n - X_0| + \tfrac{1}{2}\delta < \delta.$

This proves that $|X_n - X_0| < \delta$ for all n.
In order to see that the $X_n$ converge, we observe that by (37e)

$|X_{n+1} - X_n| = |G(X_n) - G(X_{n-1})| \leq \tfrac{1}{2}|X_n - X_{n-1}|.$

By the same reasoning

$|X_n - X_{n-1}| \leq \tfrac{1}{2}|X_{n-1} - X_{n-2}|,$

$|X_{n-1} - X_{n-2}| \leq \tfrac{1}{2}|X_{n-2} - X_{n-3}|,$

and so on. These inequalities together lead to the estimate

(37f)    $|X_{n+1} - X_n| \leq \frac{1}{2^n}|X_1 - X_0| < \frac{\delta}{2^{n+1}}.$

The existence of $X = \lim_{n \to \infty} X_n$ follows then by writing
X as sum of an infinite series

$X = X_0 + (X_1 - X_0) + (X_2 - X_1) + \cdots + (X_{n+1} - X_n) + \cdots,$

whose convergence is established from (37f) by comparison (see Volume
I, p. 521) with a convergent geometric series. That X is a solution of
(37b) follows immediately from (37d) for $n \to \infty$, using the
continuity of G(X).

By its definition (37c) the function G depends continuously not only
on X but also on the vector U. The $X_n$ obtained successively by the
recursion formula (37d) then also depend continuously on U.¹ Since the
geometric series used in the comparison that establishes the
convergence of $X = \lim_{n \to \infty} X_n$ does not depend on U, it
follows that X is a uniform limit of continuous functions of U and,
hence, is itself a continuous function of U. It is clear, moreover,
that $|X - X_0| < \delta$, since $|X_n - X_0| < \delta$ for all n
(more precisely, summing the estimates (37f) gives
$|X - X_0| \leq 2|X_1 - X_0| < \delta$). If there existed a second
solution Y with Y = G(Y) and $|Y - X_0| < \delta$, we would find from
(37e) that

$|Y - X| = |G(Y) - G(X)| \leq \tfrac{1}{2}|Y - X|$

and, hence, that $|Y - X| = 0$ and Y = X.

¹Here we make use of the fact that continuous functions of continuous
functions are again continuous.

In this way, we establish the existence, uniqueness, and continuity
of a solution X of the equation U = F(X), for which
$|X - X_0| < \delta$, provided the vector G defined by (37c) has a
derivative G′ with elements less than 1/4 in absolute value for
$|X - X_0| < \delta$ and provided

$|G(X_0) - X_0| < \tfrac{1}{2}\delta.$

It is easily seen that these requirements can be satisfied for all U
sufficiently close to $U_0$ by a suitable choice of the matrix a. By
(37c),

$G'(X) = e - aF'(X),$

where e is the unit matrix. Then, for $X = X_0$,

$G'(X_0) = e - aF'(X_0) = O$

if we choose for a the matrix reciprocal to the matrix $F'(X_0)$:

$a = (F'(X_0))^{-1}.$

(The existence of this reciprocal follows from our basic assumption
that the matrix $F'(X_0)$ has a nonvanishing determinant, that is,
that the Jacobian of the mapping F does not vanish at the point
$X_0$.) From the assumed continuity of the first derivatives of the
mapping F it follows that G′(X) depends continuously on X; hence, the
elements of G′(X) are arbitrarily small, for instance, less than 1/4,
for sufficiently small $|X - X_0|$, say for

$|X - X_0| < \delta;$

moreover, by (37c),

$|G(X_0) - X_0| = |a(U - F(X_0))| = |a(U - U_0)| < \tfrac{1}{2}\delta,$

provided U lies in a sufficiently small neighborhood of $U_0$.

This completes the proof for the local existence of a continuous
inverse for a continuously differentiable mapping with nonvanishing
Jacobian. The existence and continuity of the first derivatives of the
inverse mapping follow easily from formulae (36c, d). Let U = F(X),
where we assume that the Jacobian matrix F′(X) is nonsingular.
Then every V sufficiently close to U is of the form V = F(Y) where
Y tends to X for V tending to U. Hence, for V sufficiently close to U
the matrix H(X, Y) also is nonsingular. We find then that

$Y - X = (H(X, Y))^{-1}(V - U)$
$\qquad = (F'(X))^{-1}(V - U) + E(X, Y)(V - U),$

where

$\lim_{V \to U} E(X, Y) = \lim_{Y \to X} E(X, Y) = 0.$

This relation, however, just expresses that the vector X satisfying
U = F(X) is a differentiable function of the vector U, and that the
Jacobian matrix of X with respect to U is the reciprocal of the matrix
F′(X). The same construction of the inverse by iteration or successive
approximations obviously can be applied to mappings in any number
of dimensions.
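The recursion (37d) with the fixed matrix $a = (F'(X_0))^{-1}$ translates directly into a short program. What follows is a minimal sketch, not a definitive implementation: the sample mapping $u = x^2 - y^2$, $v = 2xy$, the starting point $X_0 = (1, 0)$, the target U = (1.1, 0.1), the step count, and every function name are assumptions made for illustration.

```python
import cmath

def F(X):
    """Sample mapping (an assumption): u = x^2 - y^2, v = 2xy."""
    x, y = X
    return (x * x - y * y, 2.0 * x * y)

def F_prime(X):
    """Jacobian matrix F'(X) of the sample mapping, as in (35g)."""
    x, y = X
    return ((2.0 * x, -2.0 * y), (2.0 * y, 2.0 * x))

def inverse_2x2(m):
    (p, q), (r, s) = m
    det = p * s - q * r
    if det == 0.0:
        raise ValueError("Jacobian vanishes at X0")
    return ((s / det, -q / det), (-r / det, p / det))

def mat_vec(m, w):
    return (m[0][0] * w[0] + m[0][1] * w[1],
            m[1][0] * w[0] + m[1][1] * w[1])

def invert_by_iteration(U, X0, steps=60):
    """Iterate X_{n+1} = X_n + a (U - F(X_n)) with a = F'(X0)^(-1) held fixed."""
    a = inverse_2x2(F_prime(X0))
    X = X0
    for _ in range(steps):
        FX = F(X)
        corr = mat_vec(a, (U[0] - FX[0], U[1] - FX[1]))
        X = (X[0] + corr[0], X[1] + corr[1])
    return X

X = invert_by_iteration(U=(1.1, 0.1), X0=(1.0, 0.0))
```

For this particular mapping F is complex squaring, so the result can be compared with `cmath.sqrt(complex(1.1, 0.1))`. Note that the matrix a is computed once at $X_0$ and never updated, which is what distinguishes this scheme from Newton's method.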


Exercises 3.3g 


1. Obtain the iterative approximation (x2, y2) for the inverse transformation 
to 


u = = Cx — 9%), v= y 


by applying (37d) to a neighborhood of X = (1, 1) or U = (0, 1).
2. Compare the result of the preceding exercise with the Taylor expansions 
of x and y to second order in the neighborhood of u = 1, v = 1. 


h. Dependent Functions 


If the Jacobian D vanishes at a point $(x_0, y_0)$, no general
statement can be made about the possibility of solving the equations
(33a) in the neighborhood of that point. Even if inverse functions do
happen to exist, they cannot be differentiable, for then the product

$\frac{d(u, v)}{d(x, y)} \cdot \frac{d(x, y)}{d(u, v)}$

would vanish, while by p. 259 it must be equal to 1. For example, the
equations

$u = x^3, \qquad v = y$

can be solved uniquely, in the form

$x = \sqrt[3]{u}, \qquad y = v,$

although the Jacobian vanishes at the origin; but the function
$\sqrt[3]{u}$ is not differentiable at the origin.

On the other hand, the equations

$u = x^2 - y^2, \qquad v = 2xy$

cannot be solved uniquely in the neighborhood of the origin, since the
two points (x, y) and (−x, −y) of the x, y-plane both correspond to the
same point of the u, v-plane.
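Both phenomena are easy to observe numerically. The sketch below is illustrative only; the helper names are our own. It shows the difference quotient of the cube root growing without bound at the origin, and the squaring map sending (x, y) and (−x, −y) to the same image.

```python
import math

def cube_root(u):
    """Real cube root, defined for negative u as well."""
    return math.copysign(abs(u) ** (1.0 / 3.0), u)

def difference_quotient(h):
    """(cube_root(h) - cube_root(0)) / h, which equals h**(-2/3) for h > 0."""
    return (cube_root(h) - cube_root(0.0)) / h

def square_map(x, y):
    """u = x^2 - y^2, v = 2xy."""
    return (x * x - y * y, 2.0 * x * y)
```

As h shrinks, `difference_quotient(h)` increases, so no finite derivative exists at u = 0; and `square_map(x, y)` agrees with `square_map(-x, -y)` for every pair, so no neighborhood of the origin is mapped one-to-one.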

If the Jacobian vanishes identically, not merely at the single point
(x, y) but at every point in a whole neighborhood of the point (x, y),
then the transformation is called degenerate. In this case, it can be
shown that the functions

$u = \phi(x, y)$ and $v = \psi(x, y)$

are dependent, in the sense that one of them is a function of the other
one.¹ We first consider the trivial case in which the equations
$\phi_x = 0$ and $\phi_y = 0$ hold everywhere, so that the function
$\phi(x, y)$ is a constant. We then see that while the point (x, y)
ranges over a whole region its image (u, v) always remains on the line
u = constant. That is, a region is mapped only into a line, instead of
on a region, so that there is no possibility of a 1-1 mapping of two
2-dimensional regions on one another.

A similar situation arises in the general case in which at least one
of the derivatives $\phi_x$ or $\phi_y$ does not vanish, but the
Jacobian D is still zero. We suppose that at a point $(x_0, y_0)$ of
the region under consideration we have $\phi_x \neq 0$. It is then
possible to solve the first equation for x in the form x = X(u, y) and
to write $v = \psi(X(u, y), y) = \chi(u, y)$, just as on p. 262, for
there we made use only of the assumption $\phi_x \neq 0$. In virtue of
(34j) and the equation D = 0, however, $\chi_y$ must be identically 0
in the region where $\phi_x \neq 0$; that is, the quantity $\chi = v$
does not depend on y at all and v is a function of u alone. We
conclude, then, that if the Jacobian of the transformation vanishes
identically, a region of the x, y-plane is mapped by the transformation
on a curve in the u, v-plane instead of on a region, for in a certain
interval of values of u only one value of v corresponds to each value
of u. Thus, if the Jacobian vanishes identically, the functions are
not independent; that is, a relation

$F(\phi, \psi) = \psi - \chi(\phi) = 0$

exists that is satisfied for all systems of values (x, y) in the region.
Conversely, if there exists a curve in the u, v-plane on which the
region of the x, y-plane is mapped, then for all points of this region
the Jacobian $D = \phi_x \psi_y - \phi_y \psi_x$ must vanish
identically, since obviously the mapping cannot be inverted in a full
neighborhood of a point. The exceptional case discussed separately at
the beginning is obviously included in this general statement. The
curve in question is then just the curve u = constant, which is a
parallel to the v-axis.
¹Vanishing of the Jacobian is also equivalent to dependence of the
vectors $(\phi_x, \phi_y)$ and $(\psi_x, \psi_y)$ formed by the first
derivatives of the mapping functions.

An example of a degenerate transformation is

$\xi = x + y, \qquad \eta = (x + y)^2.$

In this transformation all the points of the x, y-plane are mapped on
the points of the parabola $\eta = \xi^2$ in the $\xi, \eta$-plane.
Inverting the transformation is out of the question, for all the
points of the line x + y = constant are mapped on a single point
$(\xi, \eta)$. As we can easily verify, the value of the Jacobian is 0.
The relation between the functions $\xi$ and $\eta$, in accordance
with the general theorem, is given by the equation

$F(\xi, \eta) = \xi^2 - \eta = 0.$
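The verification left to the reader is a one-line computation. Written out for $\xi = x + y$, $\eta = (x + y)^2$ (only the arrangement of the steps is ours):

```latex
\[
\xi_x = 1, \qquad \xi_y = 1, \qquad
\eta_x = 2(x + y), \qquad \eta_y = 2(x + y),
\]
\[
D = \xi_x \eta_y - \xi_y \eta_x
  = 1 \cdot 2(x + y) - 1 \cdot 2(x + y) = 0
\quad \text{for all } (x, y).
\]
```

Since D vanishes at every point and not merely at isolated ones, the transformation is degenerate in the sense defined above.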


Exercises 3.3h 


1. Give an example of a pair of continuously differentiable functions
$\xi = f(x, y)$, $\eta = g(x, y)$ that are independent in one region,
and not independent in another.


2. Prove that if $\xi = ax + by + c$ and $\eta = \alpha x + \beta y + \gamma$
are dependent, the lines $\xi = 0$ and $\eta = 0$ are parallel.


i. Concluding Remarks 


The generalization of the theory to three or more independent vari- 
ables offers no particular difficulties. The chief difference is that in- 
stead of the two-rowed determinant D we have determinants with 
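three or more rows. In the case of transformations with three
independent variables

$\xi = \phi(x, y, z), \qquad \eta = \psi(x, y, z), \qquad \zeta = \chi(x, y, z),$
$x = g(\xi, \eta, \zeta), \qquad y = h(\xi, \eta, \zeta), \qquad z = l(\xi, \eta, \zeta),$

the Jacobian is given by the equation

$D = \frac{d(\xi, \eta, \zeta)}{d(x, y, z)} = \begin{vmatrix} \phi_x & \psi_x & \chi_x \\ \phi_y & \psi_y & \chi_y \\ \phi_z & \psi_z & \chi_z \end{vmatrix}.$

In the same way, for transformations

$\xi_i = \phi_i(x_1, x_2, \ldots, x_n),$
$x_i = g_i(\xi_1, \xi_2, \ldots, \xi_n) \qquad (i = 1, 2, \ldots, n)$

with n independent variables, the Jacobian is

$\frac{d(\xi_1, \xi_2, \ldots, \xi_n)}{d(x_1, x_2, \ldots, x_n)} = \begin{vmatrix} \frac{\partial\phi_1}{\partial x_1} & \frac{\partial\phi_2}{\partial x_1} & \cdots & \frac{\partial\phi_n}{\partial x_1} \\ \frac{\partial\phi_1}{\partial x_2} & \frac{\partial\phi_2}{\partial x_2} & \cdots & \frac{\partial\phi_n}{\partial x_2} \\ \vdots & \vdots & & \vdots \\ \frac{\partial\phi_1}{\partial x_n} & \frac{\partial\phi_2}{\partial x_n} & \cdots & \frac{\partial\phi_n}{\partial x_n} \end{vmatrix}.$

For more than two independent variables, it is still true that when
transformations are compounded their Jacobians are multiplied
together. In symbols,

$\frac{d(\xi_1, \xi_2, \ldots, \xi_n)}{d(\eta_1, \eta_2, \ldots, \eta_n)} \cdot \frac{d(\eta_1, \eta_2, \ldots, \eta_n)}{d(x_1, x_2, \ldots, x_n)} = \frac{d(\xi_1, \xi_2, \ldots, \xi_n)}{d(x_1, x_2, \ldots, x_n)}.$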


In particular, the Jacobian of the inverse transformation is the recip- 
rocal of the Jacobian of the original transformation. 

The theorems on the resolution and composition of transforma- 
tions, on the inversion of a transformation, and on the dependence of 
transformations remain valid for three and more independent vari- 
ables. The proofs are similar to those for the case n = 2; to avoid un- 
necessary repetition we omit them. The same holds for the construc- 
tion of the inverse mapping by the method of iteration. 

In the preceding section, we saw that the behavior of a general
transformation in many ways resembles that of an affine transformation
and that the Jacobian plays the same part as the determinant does in
the case of affine transformation. The following remark makes this
even clearer. Since the functions $\xi = \phi(x, y)$ and
$\eta = \psi(x, y)$ are differentiable in the neighborhood of
$(x_0, y_0)$, we can express them in the form

$\xi - \xi_0 = (x - x_0)\phi_x(x_0, y_0) + (y - y_0)\phi_y(x_0, y_0) + \varepsilon\sqrt{(x - x_0)^2 + (y - y_0)^2},$

$\eta - \eta_0 = (x - x_0)\psi_x(x_0, y_0) + (y - y_0)\psi_y(x_0, y_0) + \delta\sqrt{(x - x_0)^2 + (y - y_0)^2},$

where $\varepsilon$ and $\delta$ tend to zero with
$\sqrt{(x - x_0)^2 + (y - y_0)^2}$.

This shows that for sufficiently small values of $|x - x_0|$ and
$|y - y_0|$ the transformation can be represented approximately by the
affine transformation

$\xi = \xi_0 + (x - x_0)\phi_x(x_0, y_0) + (y - y_0)\phi_y(x_0, y_0),$
$\eta = \eta_0 + (x - x_0)\psi_x(x_0, y_0) + (y - y_0)\psi_y(x_0, y_0),$

whose determinant is the Jacobian of the original transformation.


Exercises 3.3i


1. Evaluate $\partial(\xi, \eta, \zeta)/\partial(x, y, z)$ for each of the following:

(a) $\xi = e^x \cos y \cos z$
    $\eta = e^x \cos y \sin z$
    $\zeta = e^x \sin y$

(b) $\xi = \cos(x + y) + \cos(y + z)$
    $\eta = \cos(x + y) + \sin(y + z)$
    $\zeta = \sin(x + y) + \cos(y + z)$

(c) $\xi = \cosh x + \log y$
    $\eta = \tanh y - \sinh z$
    $\zeta = x - y^2$

(d) $\xi = x \cos y \sin z$
    $\eta = x \sin y \sin z$
    $\zeta = x \cos z$

(e) $\xi = x \cos y$
    $\eta = x \sin y$
    $\zeta = z.$


2. Define dependence of the functions $\xi = f(x, y, z)$,
$\eta = g(x, y, z)$, $\zeta = h(x, y, z)$ in a region. Generalize the
results of Section h to this case.


3. Which of the triples of functions given in Exercise 1 are dependent? 
Give an equation relating the functions of each such triple. 


4. Show that the following three functions are dependent and find a
relation connecting them:

$\xi = x + y + z,$
$\eta = x^2 + y^2 + z^2,$
$\zeta = xy + yz + zx.$

5. Inversion in three dimensions is defined by the formulae

$\xi = \frac{x}{x^2 + y^2 + z^2}, \qquad \eta = \frac{y}{x^2 + y^2 + z^2}, \qquad \zeta = \frac{z}{x^2 + y^2 + z^2}.$

(a) Prove that the angle between any two surfaces is unchanged.

(b) Prove that spheres are transformed either into spheres or into
planes.

(c) Find the Jacobian of the transformation.

3.4 Applications 


a. Elements of the Theory of Surfaces 


For surfaces, as for curves, parametric representation is frequently 
to be preferred to other types of representation. For surfaces, we need 
two parameters instead of one; we denote them by u and v. A para- 
metric representation may be expressed in the form 


(39a)    $x = \phi(u, v), \qquad y = \psi(u, v), \qquad z = \chi(u, v),$

where $\phi$, $\psi$, and $\chi$ are given functions of the parameters
u and v and the point (u, v) ranges over a given region R in the
u, v-plane. The corresponding point with the three rectangular
coordinates (x, y, z) then ranges over a set in x, y, z-space.
Typically, this set is a surface, which can be represented in explicit
form z = f(x, y), for we may be able to solve two of our three
equations for u and v in terms of the two corresponding rectangular
coordinates. If we then substitute the expressions found for u and v
in the third equation, we obtain an unsymmetrical representation of
the surface z = f(x, y). Hence in order to ensure that the equations
really do represent a surface, we have only to assume that the three
Jacobians

(39b)    $\begin{vmatrix} \psi_u & \psi_v \\ \chi_u & \chi_v \end{vmatrix}, \qquad \begin{vmatrix} \chi_u & \chi_v \\ \phi_u & \phi_v \end{vmatrix}, \qquad \begin{vmatrix} \phi_u & \phi_v \\ \psi_u & \psi_v \end{vmatrix}$

do not all vanish at once; in a single formula, we require that

(39c)    $(\phi_u \psi_v - \phi_v \psi_u)^2 + (\psi_u \chi_v - \psi_v \chi_u)^2 + (\chi_u \phi_v - \chi_v \phi_u)^2 > 0.$


Then in some neighborhood of each point in space represented by 
(39a) it is certainly possible to express one of the three coordinates in 
terms of the other two. 

¹This is actually a special case of the parametric form, as we see by
putting x = u and y = v.

It is advantageous to replace the three equations in the parametric
representation (39a) by a single vector equation

(40a)    $X = \Phi(u, v),$

where X = (x, y, z) is the position vector of a point on the surface,
and $\Phi$ denotes the vector

$\Phi(u, v) = (\phi(u, v), \psi(u, v), \chi(u, v)).$

At each point with parameters u, v on the surface, we can form the
partial derivatives of the position vector

(40b)    $X_u = (\phi_u, \psi_u, \chi_u)$ and $X_v = (\phi_v, \psi_v, \chi_v).$

The total differential of the vector X is then [cf. formula (15b), p. 49]

(40c)    $dX = (dx, dy, dz) = X_u\,du + X_v\,dv.$

The three determinants (39b) are just the components of the vector
product $X_u \times X_v$ of the vectors $X_u$ and $X_v$ (see p. 000).
The expression on the left in (39c) represents the square of the
length of the vector $X_u \times X_v$, so that condition (39c) is
equivalent to

(40d)    $X_u \times X_v \neq 0.$

For example, the spherical surface $x^2 + y^2 + z^2 = r^2$ of radius r
is represented parametrically by the equations

(40e)    $x = r \cos u \sin v, \qquad y = r \sin u \sin v, \qquad z = r \cos v$

         $(0 \leq u < 2\pi, \quad 0 \leq v \leq \pi),$

where $v = \theta$ is the “polar inclination” and $u = \phi$ is the
“longitude” of the point on the sphere (cf. p. 250).

This example exhibits one of the advantages of parametric
representation. The three coordinates are given explicitly as
functions of u and v, and these functions are single-valued. If v runs
from π/2 to π, we obtain the lower hemisphere, that is,

$z = -\sqrt{r^2 - x^2 - y^2},$

while values of v from 0 to π/2 give the upper hemisphere. Thus, for
the parametric representation it is not necessary, as it is for the
representation

$z = \pm\sqrt{r^2 - x^2 - y^2},$

to consider two single-valued branches of the function in order to
obtain the whole sphere.

We obtain another parametric representation of the sphere by
means of stereographic projection (see Volume I, p. 21). In order to
project the sphere $x^2 + y^2 + z^2 - r^2 = 0$ stereographically from
the north pole (0, 0, r) on the equatorial plane z = 0, we join each
point of the surface to the north pole N by a straight line and call
the intersection of this line with the equatorial plane the
stereographic image of the corresponding point of the sphere
(Fig. 3.12). We thus obtain a 1-1 correspondence between the points of
the sphere and the points of the plane, except for the north pole N.
Using elementary geometry, we readily find that this correspondence is
expressed by the formulae

(40f)    $x = \frac{2r^2 u}{u^2 + v^2 + r^2}, \qquad y = \frac{2r^2 v}{u^2 + v^2 + r^2}, \qquad z = \frac{(u^2 + v^2 - r^2)\,r}{u^2 + v^2 + r^2},$


where (u, v) are the rectangular coordinates of the image-point in the 
plane. These equations may be regarded as a parametric representa- 
tion of the sphere, the parameters u and v being rectangular coordi- 
nates in the u, v-plane. 
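The formulae (40f) can be sanity-checked by machine: the computed point should lie on the sphere, and it should be collinear with the north pole N = (0, 0, r) and the image point (u, v, 0). The following sketch does exactly that; the function names and tolerances are our own choices.

```python
def stereographic_point(u, v, r):
    """Point (x, y, z) of the sphere given by (40f) for plane coordinates (u, v)."""
    d = u * u + v * v + r * r
    return (2.0 * r * r * u / d,
            2.0 * r * r * v / d,
            r * (u * u + v * v - r * r) / d)

def on_sphere(p, r, tol=1e-12):
    """Check x^2 + y^2 + z^2 = r^2 up to a relative tolerance."""
    return abs(p[0] ** 2 + p[1] ** 2 + p[2] ** 2 - r * r) <= tol * r * r

def collinear_with_north_pole(u, v, r, tol=1e-12):
    """Check P - N = t * ((u, v, 0) - N) for some scalar t, with N = (0, 0, r)."""
    x, y, z = stereographic_point(u, v, r)
    t = (r - z) / r  # read off from the third component: z - r = -t * r
    return (abs(x - t * u) <= tol * max(1.0, abs(u))
            and abs(y - t * v) <= tol * max(1.0, abs(v)))
```

For example, `stereographic_point(0.0, 0.0, r)` gives the south pole (0, 0, −r), as the geometry requires, and points with $u^2 + v^2 = r^2$ land on the equator z = 0.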


Figure 3.12 Stereographic projection of the sphere 


As a further example, we give parametric representations of the
surfaces

$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$ and $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1,$

which are called the hyperboloid of one sheet and the hyperboloid of
two sheets respectively (cf. Figs. 3.13 and 3.14). The hyperboloid of
one sheet is represented by

(40g)    $x = a \cos u \cosh v,$
         $y = b \sin u \cosh v,$
         $z = c \sinh v$
         $(0 \leq u < 2\pi, \quad -\infty < v < +\infty)$

and the hyperboloid of two sheets by

(40h)    $x = a \cos u \sinh v,$
         $y = b \sin u \sinh v,$
         $z = \pm c \cosh v$
         $(0 \leq u < 2\pi, \quad 0 \leq v < +\infty).$

Figure 3.13 Hyperboloid of one sheet.

Figure 3.14 Hyperboloid of two sheets.

In general, we may regard the parametric representation of a surface
as the mapping of the region R of the u, v-plane onto the corresponding
surface. To each point of the region R of the u, v-plane there
corresponds one point of the surface, and typically the converse is
also true.¹

In the same way, a curve u = u(t), v = v(t) in the u, v-plane
corresponds by virtue of the equations

$x = \phi(u(t), v(t)) = x(t), \ldots$

to a curve on the surface. In particular, in the representation (40e)
of the sphere by means of spherical coordinates the meridians are
represented by the equation u = constant and the parallels of latitude
by v = constant. Generally, we may consider those curves on a surface
that are given by equations u = constant or v = constant. If in our
parametric representation we substitute a definite fixed value for u,
we obtain a “space curve” or “twisted curve” lying on the surface
and having v as parameter, and a corresponding statement holds good
if we substitute a fixed value for v and allow u to vary. These curves
u = constant and v = constant are the parametric curves or coordinate
lines on the surface. The net of parametric curves corresponds to
the net of parallels to the axes in the u, v-plane (Fig. 3.15).

¹This, of course, is not always the case. For example, in the
representation (40e) of the sphere by spherical coordinates (p. 279)
the poles of the sphere correspond to the whole line segments given by
v = 0 and v = π.


Figure 3.15 Parametric curves 
u = constant, v = constant. 


The tangent to the curve on the surface corresponding to the curve
u = u(t), v = v(t) in the u, v-plane has the direction of the vector

(41)    $X_t = (x_t, y_t, z_t) = \left(x_u \frac{du}{dt} + x_v \frac{dv}{dt},\; y_u \frac{du}{dt} + y_v \frac{dv}{dt},\; z_u \frac{du}{dt} + z_v \frac{dv}{dt}\right)$
        $\quad\;\; = X_u \frac{du}{dt} + X_v \frac{dv}{dt}$

(see p. 212). At a given point of the surface the tangential vectors
$X_t$ of all curves on the surface passing through that point are
dependent on the two vectors $X_u$, $X_v$, which respectively are
tangential to the parametric lines v = constant and u = constant
passing through that point. This means that the tangents all lie in
the plane through the point spanned by the vectors $X_u$ and $X_v$,
the tangent plane to the surface at that point. The normal to the
surface is perpendicular to all tangential directions, in particular
to the vectors $X_u$ and $X_v$. It follows (see p. 182) that the
surface normal is parallel to the direction of the vector product

(42)    $X_u \times X_v = (y_u z_v - y_v z_u,\; z_u x_v - z_v x_u,\; x_u y_v - x_v y_u).$


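One of the most important tools for investigation of the properties
of a given surface is the study of the curves that lie on it. Here we
shall only give the expression for s, the length of arc of such a
curve. As mentioned on p. 213 (see also Volume I, p. 353),

$\left(\frac{ds}{dt}\right)^2 = \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2 = X_t \cdot X_t,$

so that in view of the equations (41) we obtain

(43)    $\left(\frac{ds}{dt}\right)^2 = \left(X_u \frac{du}{dt} + X_v \frac{dv}{dt}\right) \cdot \left(X_u \frac{du}{dt} + X_v \frac{dv}{dt}\right)$
        $= \left(x_u \frac{du}{dt} + x_v \frac{dv}{dt}\right)^2 + \left(y_u \frac{du}{dt} + y_v \frac{dv}{dt}\right)^2 + \left(z_u \frac{du}{dt} + z_v \frac{dv}{dt}\right)^2$
        $= E\left(\frac{du}{dt}\right)^2 + 2F\,\frac{du}{dt}\frac{dv}{dt} + G\left(\frac{dv}{dt}\right)^2.$

Here the coefficients E, F, G, the Gaussian fundamental quantities of
the surface, are given by

(44a)    $E = \left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2 + \left(\frac{\partial z}{\partial u}\right)^2 = X_u \cdot X_u,$

(44b)    $F = \frac{\partial x}{\partial u}\frac{\partial x}{\partial v} + \frac{\partial y}{\partial u}\frac{\partial y}{\partial v} + \frac{\partial z}{\partial u}\frac{\partial z}{\partial v} = X_u \cdot X_v,$

(44c)    $G = \left(\frac{\partial x}{\partial v}\right)^2 + \left(\frac{\partial y}{\partial v}\right)^2 + \left(\frac{\partial z}{\partial v}\right)^2 = X_v \cdot X_v.$

These depend only on the surface itself and its parametric
representation and not on the particular choice of the curve on the
surface. The expression (43) for the derivative of the length of arc s
with respect to the parameter t usually is written symbolically
without reference to the parameter used along the curve. One says that
the line element ds is given by the quadratic differential form
(“fundamental form”)

(45)    $ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2.$

The length of the cross product $X_u \times X_v$ can be expressed in
terms of E, F, G since (see p. 182)

(45a)    $|X_u \times X_v|^2 = |X_u|^2 |X_v|^2 - (X_u \cdot X_v)^2 = EG - F^2.$

Our original assumption (39c) or (40d) on the parametric
representation can thus be formulated as the condition

(46)    $EG - F^2 > 0$

for the fundamental quantities.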
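As an illustration, the fundamental quantities of the sphere (40e) can be computed from the tangent vectors (40b) and compared with the identity (45a). This is a sketch with helper names of our own choosing; the closed forms $E = r^2 \sin^2 v$, $F = 0$, $G = r^2$ follow by differentiating (40e).

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sphere_tangents(u, v, r):
    """Partial derivatives X_u, X_v of the parametrization (40e)."""
    Xu = (-r * math.sin(u) * math.sin(v), r * math.cos(u) * math.sin(v), 0.0)
    Xv = (r * math.cos(u) * math.cos(v), r * math.sin(u) * math.cos(v),
          -r * math.sin(v))
    return Xu, Xv

def fundamental_quantities(u, v, r):
    """E = X_u . X_u, F = X_u . X_v, G = X_v . X_v, as in (44a, b, c)."""
    Xu, Xv = sphere_tangents(u, v, r)
    return dot(Xu, Xu), dot(Xu, Xv), dot(Xv, Xv)
```

Here $EG - F^2 = r^4 \sin^2 v$, which is positive except at the poles v = 0 and v = π; those are exactly the points where the parametrization (40e) fails condition (46).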
The direction cosines for one of the two normals to the surface are 
the components of the unit vector 


1 


X, xX Xo= JEG fe Bu X Xo. 


1 
|Xu x Xo| 


It follows from (42) that the normal for a surface represented parame- 
trically has the direction cosines 


_ Yuzo — Yreu _ ZuXlv — SvXu L XuVv — Xyu 
(47) cos a = JEG FF > CSB = ypa Fe > SY = JEG FE 


The tangent to a curve u = u(t), v = v(t) on the surface has the di- 
rection of the vector 


$X_t = X_u \frac{du}{dt} + X_v \frac{dv}{dt}.$


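If we now consider a second curve $u = u(\tau)$, $v = v(\tau)$ on the surface 
referred to a parameter $\tau$, its tangent has the direction of the vector 


$X_\tau = X_u \frac{du}{d\tau} + X_v \frac{dv}{d\tau}.$


If the two curves pass through the same point on the surface, the 
cosine of the angle of intersection $\omega$ is the same as the cosine of the 
angle between the vectors $X_t$ and $X_\tau$. Hence (see p. 131), 


$\cos\omega = \frac{X_t \cdot X_\tau}{|X_t|\,|X_\tau|}.$


Here 

$X_t \cdot X_\tau = E\,\frac{du}{dt}\frac{du}{d\tau} + F\left(\frac{du}{dt}\frac{dv}{d\tau} + \frac{dv}{dt}\frac{du}{d\tau}\right) + G\,\frac{dv}{dt}\frac{dv}{d\tau}.$


Consequently the cosine of the angle between the two curves on the 
surface is given by 


(48) $\cos\omega = \frac{E\,\frac{du}{dt}\frac{du}{d\tau} + F\left(\frac{du}{dt}\frac{dv}{d\tau} + \frac{dv}{dt}\frac{du}{d\tau}\right) + G\,\frac{dv}{dt}\frac{dv}{d\tau}}{\sqrt{E\left(\frac{du}{dt}\right)^2 + 2F\,\frac{du}{dt}\frac{dv}{dt} + G\left(\frac{dv}{dt}\right)^2}\;\sqrt{E\left(\frac{du}{d\tau}\right)^2 + 2F\,\frac{du}{d\tau}\frac{dv}{d\tau} + G\left(\frac{dv}{d\tau}\right)^2}}.$
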
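
Formula (48) can be checked numerically by comparing it with the angle between the two tangent vectors computed directly in space. The sketch below does this on the unit sphere; the parametrization, the sample point, and the two parameter-plane directions `d1`, `d2` are arbitrary choices made for illustration.

```python
import math

def sphere_partials(u, v):
    """Analytic X_u and X_v for X = (cos u sin v, sin u sin v, cos v)."""
    xu = (-math.sin(u) * math.sin(v), math.cos(u) * math.sin(v), 0.0)
    xv = (math.cos(u) * math.cos(v), math.sin(u) * math.cos(v), -math.sin(v))
    return xu, xv

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cos_angle_formula(E, F, G, d1, d2):
    """Formula (48); d1 = (du/dt, dv/dt), d2 = (du/dtau, dv/dtau)."""
    num = E * d1[0] * d2[0] + F * (d1[0] * d2[1] + d1[1] * d2[0]) + G * d1[1] * d2[1]
    len1 = math.sqrt(E * d1[0] ** 2 + 2 * F * d1[0] * d1[1] + G * d1[1] ** 2)
    len2 = math.sqrt(E * d2[0] ** 2 + 2 * F * d2[0] * d2[1] + G * d2[1] ** 2)
    return num / (len1 * len2)

def cos_angle_direct(xu, xv, d1, d2):
    """Cosine of the angle between X_t and X_tau computed in space."""
    t1 = [a * d1[0] + b * d1[1] for a, b in zip(xu, xv)]
    t2 = [a * d2[0] + b * d2[1] for a, b in zip(xu, xv)]
    return dot(t1, t2) / math.sqrt(dot(t1, t1) * dot(t2, t2))

u0, v0 = 0.4, 1.1
xu, xv = sphere_partials(u0, v0)
E, F, G = dot(xu, xu), dot(xu, xv), dot(xv, xv)
d1, d2 = (1.0, 2.0), (3.0, -1.0)
via_48 = cos_angle_formula(E, F, G, d1, d2)
direct = cos_angle_direct(xu, xv, d1, d2)
print(via_48, direct)
```

The two values agree up to rounding, since (48) is only the expansion of $X_t \cdot X_\tau / (|X_t|\,|X_\tau|)$ in terms of E, F, G.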
The mapping of one plane region on another may be regarded as a 
special case of parametric representation, for if the third of our 
functions z(u, v) in (39a) vanishes for all values of u and v under 
consideration, our equations merely represent the mapping of a region of the 
u, v-plane on a region of the x, y-plane; or if we prefer to think in 
terms of transformations of coordinates, the equations define a system 
of curvilinear coordinates in the u, v-region, and the inverse functions 
(if they exist) define a curvilinear u, v-system of coordinates in the 
plane x, y-region. In terms of the curvilinear coordinates (u, v) the line 
element in the x, y-plane is simply [see (44a, b, c)] 


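$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2,$


where 

(49a) $E = \left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2,$
(49b) $F = \frac{\partial x}{\partial u}\frac{\partial x}{\partial v} + \frac{\partial y}{\partial u}\frac{\partial y}{\partial v},$
(49c) $G = \left(\frac{\partial x}{\partial v}\right)^2 + \left(\frac{\partial y}{\partial v}\right)^2.$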


As a further example of the representation of a surface in parame- 
tric form we consider the anchor ring, or torus. This is obtained by ro- 
tating a circle about a line which lies in the plane of the circle and 
does not intersect it (cf. Fig. 3.16). We take the axis of rotation as the 
z-axis and choose the y-axis in such a way that it passes through the 
center of the circle, whose y-coordinate we denote by a. If the radius 
of the circle is r < |a|, we obtain 


$x = 0, \quad y - a = r\cos\theta, \quad z = r\sin\theta \qquad (0 \le \theta < 2\pi)$


as a parametric representation of the circle in the y, z-plane. 


Figure 3.16 Generation of a torus by the rotation of a circle. 


Now letting the circle rotate about the z-axis, we find that for each point 
of the circle $x^2 + y^2$ remains constant; that is, $x^2 + y^2 = (a + r\cos\theta)^2$. 
If $\phi$ is the angle of rotation about the z-axis, we have 


$x = (a + r\cos\theta)\sin\phi,$
$y = (a + r\cos\theta)\cos\phi,$
$z = r\sin\theta$
$(0 \le \phi < 2\pi, \quad 0 \le \theta < 2\pi)$


as a parametric representation of the torus in terms of the parameters 
$\theta$ and $\phi$. In this representation the torus appears as the image of 
a square of side $2\pi$ in the $\theta, \phi$-plane, where any pair of boundary points 
lying on the same line $\theta$ = constant or $\phi$ = constant corresponds to 
only one point on the surface, and the four corners of the square all 
correspond to the same point. 

For the line element on the anchor ring, we have by (44a, b, c), (45) 


$ds^2 = r^2\,d\theta^2 + (a + r\cos\theta)^2\,d\phi^2.$
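
To see the line element at work, the sketch below computes the length of a closed curve on the torus in two ways: by integrating $ds$ from the formula above, and by summing chord lengths of a fine polygon inscribed in the space curve. The dimensions `A`, `R`, the sample curve $\theta = t$, $\phi = 2t$, and the number of subdivisions are assumptions made for this illustration.

```python
import math

A, R = 2.0, 1.0  # torus dimensions a and r (illustrative values with r < |a|)

def torus(theta, phi):
    """Point on the torus for the parameters theta, phi used in the text."""
    rho = A + R * math.cos(theta)
    return (rho * math.sin(phi), rho * math.cos(phi), R * math.sin(theta))

def curve(t):
    """Sample curve on the torus: theta = t, phi = 2t (an arbitrary choice)."""
    return t, 2.0 * t

def length_from_line_element(n=20000):
    """Midpoint rule for the integral of sqrt(r^2 theta'^2 + (a + r cos theta)^2 phi'^2)."""
    h = 2.0 * math.pi / n
    dtheta, dphi = 1.0, 2.0  # derivatives of the sample curve
    total = 0.0
    for i in range(n):
        theta, _ = curve((i + 0.5) * h)
        total += math.sqrt(R ** 2 * dtheta ** 2
                           + (A + R * math.cos(theta)) ** 2 * dphi ** 2) * h
    return total

def length_from_polyline(n=20000):
    """Sum of chord lengths of an inscribed polygon in space."""
    h = 2.0 * math.pi / n
    pts = [torus(*curve(i * h)) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

print(length_from_line_element(), length_from_polyline())
```

Both numbers approximate the same arc length; they differ only by the discretization error of the two methods.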


Exercises 3.4a 


1. Calculate the line element 
(a) on the sphere 

$x = \cos u \sin v, \quad y = \sin u \sin v, \quad z = \cos v;$

(b) on the hyperboloid 

$x = \cos u \cosh v, \quad y = \sin u \cosh v, \quad z = \sinh v;$

(c) on a surface of revolution given by 

$r = \sqrt{x^2 + y^2} = f(z),$

using the cylindrical coordinates z and $\theta = \arctan(y/x)$ as 
coordinates on the surface; 
(d) on the quadric $t_3$ = constant of the family of confocal quadrics given 
by 

$\frac{x^2}{a - t} + \frac{y^2}{b - t} + \frac{z^2}{c - t} = 1,$

using $t_1$ and $t_2$ as coordinates on the quadric (cf. Exercise 9, p. 256). 
2. Find the Gauss fundamental quantities for the catenoid 
$x = a\cosh(t/a)\cos(\theta/a)$, $y = a\cosh(t/a)\sin(\theta/a)$, $z = t$; 
show that E − G = F = 0. 

3. For the surface $x = u\cos v$, $y = u\sin v$, $z = \alpha u + \beta$, where 
$\alpha$, $\beta$ = constant, show that the images of the lines u = constant, 
v = constant are orthogonal. 

4. What is the fundamental form giving the line element for a surface given 
by an equation z = f(x, y)? 

5. Prove that if a new system of curvilinear coordinates r, s is introduced 
on a surface with parameters u, v by means of the equations 

$u = u(r, s), \quad v = v(r, s),$

then 

$E'G' - F'^2 = (EG - F^2)\left[\frac{d(u, v)}{d(r, s)}\right]^2,$

where E', F', G' denote the fundamental quantities taken with respect to 
r, s and E, F, G those taken with respect to u, v. 

6. Let t be a tangent to a surface S at the point P, and consider the sections 
of S made by all planes containing t. Prove that the centers of curvature 
of the different sections lie on a circle. 

7. If t is a tangent to the surface S at the point P, we call the curvature of 
the normal plane section through t (i.e., the section through t and the 
normal) at that point the curvature k of S in the direction t. For every 
tangent at P we take the vector with the direction of t, initial point P, 
and length $1/\sqrt{k}$. Prove that the final points of these vectors lie on a 
conic. 

8. A curve is given as the intersection of the two surfaces 

$x^2 + y^2 + z^2 = 1,$
$ax^2 + by^2 + cz^2 = 0.$

Find the equations of 
(a) the tangent, 
(b) the osculating plane, at any point of the curve. 

9. If the coordinates (x, y, z) of a point on a sphere are given by the 
equations (cf. p. 250) 

$x = a\sin\theta\cos\phi, \quad y = a\sin\theta\sin\phi, \quad z = a\cos\theta,$

show that the two curves of the systems $\theta + \phi = \alpha$, $\theta - \phi = \beta$, which 
pass through any point $(\theta, \phi)$, cut one another at the angle 
$\arccos\{(1 - \sin^2\theta)/(1 + \sin^2\theta)\}$ (cf. p. 285). 

Show that the radius of curvature of either curve is equal to 

$\frac{a(1 + \sin^2\theta)^{3/2}}{(5 + 3\sin^2\theta)^{1/2}}.$


b. Conformal Transformation in General 


A transformation in the plane 


(50) $x = \phi(u, v), \quad y = \psi(u, v)$


is called conformal if it maps any two intersecting curves into two 
others enclosing the same angle as the original ones. 


THEOREM. A necessary and sufficient condition that a con- 
tinuously differentiable transformation (50) should be conformal is that 
the Cauchy-Riemann equations 


(51a) $\phi_u - \psi_v = 0, \quad \phi_v + \psi_u = 0$
or 
(51b) $\phi_u + \psi_v = 0, \quad \phi_v - \psi_u = 0$


hold. In the first case the direction of the angles is preserved, in the 
second case the direction is reversed.¹ 

¹This last statement follows directly from the statements on p. 260 concerning the 
sign of the Jacobian $D = \phi_u \psi_v - \phi_v \psi_u$. In case (51a) holds, we have 
$D = \phi_u^2 + \phi_v^2 \ge 0$; in case (51b), $D = -\phi_u^2 - \phi_v^2 \le 0$. 

The proof of this follows: If the transformation is conformal, the 
two orthogonal curves $u = \text{constant} = u_0$, $v = v_0 + t$ and $u = u_0 + \tau$, 
$v = \text{constant} = v_0$ in the u, v-plane must map into orthogonal curves 
in the x, y-plane. From the formula (48) for the angle between two 
curves (p. 285) it follows immediately that 


(51c) $0 = F = \phi_u \phi_v + \psi_u \psi_v.$


In the same way, the curves corresponding to the lines $u = u_0 + t$, 
$v = v_0 + t$ and $u = u_0 + \tau$, $v = v_0 - \tau$ must be orthogonal. This gives 


(51d) $0 = E - G = \phi_u^2 + \psi_u^2 - \phi_v^2 - \psi_v^2.$

Equation (51c) can be written as 

$\phi_u = \lambda \psi_v, \quad \psi_u = -\lambda \phi_v,$


where $\lambda$ denotes a constant of proportionality. Introducing this into 
equation (51d), we obtain $(\lambda^2 - 1)(\phi_v^2 + \psi_v^2) = 0$ and hence 
$\lambda^2 = 1$, so that one or the other of our two systems of 
Cauchy-Riemann equations (51a, b) holds. 
That the Cauchy-Riemann equations are a sufficient condition for 
conformality except at points where all four of the quantities $\phi_u$, $\phi_v$, 
$\psi_u$, $\psi_v$ are zero is confirmed by the following observations. 
Equations (51a) or (51b) yield relations 


$E = G \ge 0, \quad F = 0$


for the fundamental quantities E, F, G, defined by (49a, b, c). By (48) 
the angle $\omega$ between two curves in the x, y-plane is then given by 


$\cos\omega = \frac{\frac{du}{dt}\frac{du}{d\tau} + \frac{dv}{dt}\frac{dv}{d\tau}}{\sqrt{\left(\frac{du}{dt}\right)^2 + \left(\frac{dv}{dt}\right)^2}\;\sqrt{\left(\frac{du}{d\tau}\right)^2 + \left(\frac{dv}{d\tau}\right)^2}}.$


The right side of this equation is just the cosine of the angle between 
the corresponding curves in the u, v-plane. Thus, the mapping 
preserves angles between curves, possibly changing their orientation. 
The only exception is presented by points where E = F = G = 0, 
that is, by points where all first derivatives of both mapping functions 
vanish.¹ 
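
The sufficiency argument can be illustrated numerically. The sketch below takes the mapping $x = e^u \cos v$, $y = e^u \sin v$ (an example chosen here, not taken from the text), verifies the Cauchy-Riemann equations (51a) and the relations E = G, F = 0 at a sample point, and compares the angle between two directions in the u, v-plane with the angle between their images.

```python
import math

def jacobian(u, v):
    """Partials (phi_u, phi_v, psi_u, psi_v) of x = e^u cos v, y = e^u sin v."""
    eu = math.exp(u)
    return (eu * math.cos(v), -eu * math.sin(v),
            eu * math.sin(v), eu * math.cos(v))

def cos_between(a, b):
    """Cosine of the angle between two plane vectors."""
    return (a[0] * b[0] + a[1] * b[1]) / (math.hypot(*a) * math.hypot(*b))

u0, v0 = 0.3, 1.2
phi_u, phi_v, psi_u, psi_v = jacobian(u0, v0)

# Residuals of the Cauchy-Riemann equations (51a)
cr_residuals = (phi_u - psi_v, phi_v + psi_u)

# Fundamental quantities (49a, b, c) of the plane mapping
E = phi_u ** 2 + psi_u ** 2
F = phi_u * phi_v + psi_u * psi_v
G = phi_v ** 2 + psi_v ** 2

# Two directions (du, dv) in the u, v-plane and their images in the x, y-plane
d1, d2 = (1.0, 2.0), (-3.0, 0.5)
img1 = (phi_u * d1[0] + phi_v * d1[1], psi_u * d1[0] + psi_v * d1[1])
img2 = (phi_u * d2[0] + phi_v * d2[1], psi_u * d2[0] + psi_v * d2[1])

cos_uv = cos_between(d1, d2)
cos_xy = cos_between(img1, img2)
print(cr_residuals, E - G, F, cos_uv, cos_xy)
```

Here E = G = $e^{2u} > 0$ everywhere, so this particular mapping has no exceptional points.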


Exercises 3.4b 


1. Investigate the behavior of the mapping $x = u^2 - v^2$, $y = 2uv$. Is it 
conformal at u = 2, v = 3? At u = v = 0? Why? 

2. Where is the mapping $x = \tfrac{1}{2}\log(u^2 + v^2)$, $y = \arctan(v/u)$, conformal? 

3. Show that if the mappings $(u, v) \to (x, y)$ and $(u, v) \to (\xi, \eta)$ are both 
conformal, the mapping $(u, v) \to (x\xi - y\eta,\; x\eta + y\xi)$ is also conformal. 

4. (a) Prove that the stereographic projection of the unit sphere on the 
plane is conformal. 
(b) Prove that circles on the sphere are transformed either into circles 
or into straight lines in the plane. 
(c) Prove that in stereographic projection reflection of the spherical 
surface in the equatorial plane corresponds to an inversion in the 
u, v-plane. 
(d) Find the expression for the line element on the sphere in terms of the 
parameters u, v. 

5. Under what conditions on the Gaussian fundamental coefficients (44) 
will the mapping from the u, v-plane to the surface X = X(u, v) be 
conformal? 

6. Find a conformal mapping of the sphere $x = \cos\theta\sin\phi$, $y = \sin\theta\sin\phi$, 
$z = \cos\phi$ into the u, v-plane such that $\theta = u$ and $\phi = f(v)$ with $f(0) = \tfrac{1}{2}\pi$. 


¹There the mapping may actually cease to be conformal. 


3.5 Families of Curves, Families of Surfaces, and Their 
Envelopes 


a. General Remarks 


On various occasions we have already considered curves or sur- 
faces not as individual configurations but as members of a family of 
curves or surfaces, such as f(x, y) = c, where to each value of c there 
corresponds a different curve of the family. 

For example, the lines parallel to the y-axis in the x, y-plane, that is, 
the lines x = c, form a family of curves. The same is true for the family 
of concentric circles x? + y? = c? about the origin; to each value of 
c there corresponds a circle of the family, namely, the circle with ra- 
dius c. Similarly, the rectangular hyperbolas xy = c form a family of 
curves, sketched in Fig. 3.2. The particular value c = 0 corresponds 
to the degenerate hyperbola consisting of the two coordinate axes. 
Another example of a family of curves is the set of all the normals 
to a given curve. If the curve is given in terms of the parameter t by the 
equations $\xi = \phi(t)$, $\eta = \psi(t)$, we obtain the equation of the family of 
normals in the form (see Volume I, p. 345) 


$(x - \phi(t))\,\phi'(t) + (y - \psi(t))\,\psi'(t) = 0,$


where t is used instead of c to denote the parameter of the family. 
The general concept of a family of curves can be expressed analyt- 
ically in the following way. Let 


$f(x, y, c)$


be a continuously differentiable function of the two independent 
variables x and y and of the parameter c, where the parameter varies 
in a given interval. (Thus, the parameter is really a third independent 
variable, which is lettered differently simply because it plays a 
different part.) Then, if for each value of the parameter c the equation 


(52a) $f(x, y, c) = 0$


represents a curve, the aggregate of the curves obtained as c describes 
its interval is called a family of curves depending on the parameter c. 

Each curve of such a family may also be represented in parametric 
form 


(52b) $x = \phi(t, c), \quad y = \psi(t, c),$


where c is the parameter distinguishing the different curves of the 
family and t the parameter along the curve. 
For example, the equations 


$x = c\cos t, \quad y = c\sin t$


represent the family of concentric circles mentioned above; again the 
equations 


represent the family of rectangular hyperbolas mentioned above, ex- 
cept for the degenerate hyperbola consisting of the coordinate axes. 

Occasionally we are led to consider families of curves that depend 
on several parameters. For example, the aggregate of all circles 
$(x - a)^2 + (y - b)^2 = c^2$ in the plane is a family of curves depending on 
the three parameters a, b, c. If nothing is said to the contrary, we shall 
always understand a family of curves to be a “one-parameter” family, 
depending on a single parameter. The other cases we shall distinguish 
by speaking of two-parameter, three-parameter, or multiparameter 
families of curves. 

Similar statements of course hold for families of surfaces in space. 
If we are given a continuously differentiable function f(x, y, z, c) and 
if for each value of the parameter c in a certain definite interval the 
equation 


f(x, y, z,c) = 0 


represents a surface in the space with rectangular coordinates x, y, z, 
then the aggregate of the surfaces obtained by letting c describe its 
interval is called a family of surfaces, or, more precisely, a 
one-parameter family of surfaces with the parameter c. For example, the spheres 
$x^2 + y^2 + z^2 = c^2$ about the origin form such a family. As with curves, 
we can also consider families of surfaces depending on several 
parameters. 

Thus, the planes defined by the equation 


$ax + by + \sqrt{1 - a^2 - b^2}\; z + 1 = 0$


form a two-parameter family depending on the parameters a and b 
if the parameters a and b range over the region $a^2 + b^2 \le 1$. This 
family of surfaces consists of the class of all planes that are at unit 
distance from the origin.¹ 


Exercises 3.5a 
1. Characterize the following families of curves geometrically: 
(a) a + B2 = c?, a, b = known constants, c = a parameter 


(b) $x^2 + (y - c)^2 = c^2$, c = parameter 
(c) $x = \cos(c + t)$, $y = \sin(c + t)$, $0 \le t < 2\pi$, c = parameter. 
2. Describe the one-parameter family of surfaces 


$(x - c)^2 + (y - 1 - c)^2 + (z + \sqrt{2} - 2c)^2 = 1.$


b. Envelopes of One-Parameter Families of Curves 


If a family of straight lines consists of the tangents to a plane curve 
E (e.g., if the family of normals of a curve C is the family of tangents to 
the evolute E of C; cf. Volume I, p. 424,) we shall say that the curve E 
is the envelope of the family of lines. In the same way, we shall say that 
the family of circles with radius 1 and center on the x-axis—that is, 
the family of circles with the equation (x — c)? + y? — 1 = 0—has as 
its envelope the pair of lines y = 1 and y = — 1, which touch each of 
the circles (Fig. 3.17). In both examples, we can obtain the point of con- 
tact of the envelope and a curve of the family with parameter value c 
by finding the intersections of the two curves of the family with para- 
meter values c and c + h and then letting h tend to 0. We express this 
briefly by saying that the envelope is the locus of the intersections of 
neighbouring curves. 

¹Sometimes a one-parametric family of surfaces is referred to as $\infty^1$ surfaces, a 
two-parametric family as $\infty^2$ surfaces, and so on. 


Figure 3.17 Family of circles with envelope. 


For any family of curves a curve E that at each of its points touches 
some one of the curves of the family is called the envelope of the family 
of curves. The question now arises of finding the envelope E of a given 
family of curves f(x, y, c) = 0. We first make a few plausible remarks 
in which we assume that an envelope E does exist and that it can be 
obtained, as in the above cases, as the locus of the intersections of 
neighboring curves.¹ We then obtain the point of contact of the curve 
f(x, y, c) = 0 with the curve E in the following way: In addition to this 
curve we consider a neighboring curve f(x, y, c + h) = 0, find the 
intersection of these two curves, and then let h tend to 0. The point of 
intersection must then approach the point of contact sought. At the 
point of intersection the equation 


$\frac{f(x, y, c + h) - f(x, y, c)}{h} = 0$


is true as well as the equations f(x, y, c + h) = 0 and f(x, y, c) = 0. 
In the first equation, we pass to the limit $h \to 0$. Since we assume the 
existence of the partial derivative $f_c$, this gives the two equations 


(53) $f(x, y, c) = 0, \quad f_c(x, y, c) = 0$


for the point of contact of the curve f(x, y, c) = 0 with the envelope. 
If we can determine x and y as functions of c by means of these equa- 
tions, we obtain the parametric representation of a curve with the 
parameter c, and this curve is the envelope. By elimination of the 
parameter c, the curve can also be represented in the form g(x, y) = 0. 
This equation is called the discriminant of the family, and the curve 
given by the equation g(x, y) = 0 is called the discriminant curve. 


¹Since this last assumption will be shown by examples to be too restrictive, we shall 
shortly replace these plausibilities by a more complete discussion. 

We are thus led to the following rule: In order to obtain the 
envelope of a family of curves f(x, y, c) = 0, we consider the two equations 
f(x, y, c) = 0 and $f_c(x, y, c) = 0$ simultaneously and attempt to express 
x and y as functions of c by means of them or to eliminate the quantity 
c between them. 

We now replace these heuristic considerations by a more general 
discussion based on the definition of the envelope as the curve of con- 
tact. At the same time, we shall learn under what conditions our rule 
actually does give the envelope and what other possibilities present 
themselves. 

To begin with, we assume that E is an envelope that can be 
represented in terms of the parameter c by two continuously differentiable 
functions 


$x = x(c), \quad y = y(c),$


where 


$\left(\frac{dx}{dc}\right)^2 + \left(\frac{dy}{dc}\right)^2 \ne 0,$

and that E at the point with parameter c touches the curve of the 
family f(x, y, c) = 0 with the same value of the parameter c. The 
equation f(x, y, c) = 0 is then satisfied at the point of contact. 
Consequently, if we substitute the expressions x(c) and y(c) for x and y in 
this equation, it remains valid for all values of c in the interval. On 
differentiating with respect to c, we at once obtain 


$f_x \frac{dx}{dc} + f_y \frac{dy}{dc} + f_c = 0.$

Now the condition of tangency is 

$f_x \frac{dx}{dc} + f_y \frac{dy}{dc} = 0,$


for the quantities dx/dc and dy/dc are proportional to the direction 
cosines of the tangent to E and the quantities $f_x$ and $f_y$ are proportional 
to the direction cosines of the normal to the curve f(x, y, c) = 0 of the 
family, and these directions must be at right angles to one another. 
It follows that the envelope satisfies the equation $f_c = 0$, and we thus 
see that equations (53) form a necessary condition for the envelope. 

In order to find out how far this condition is also sufficient, we 
assume that a curve E represented by two continuously differentiable 
functions x = x(c) and y = y(c) satisfies the two equations f(x, y, c) = 0 
and $f_c(x, y, c) = 0$. In f(x, y, c) = 0 we again substitute x(c) and y(c) 
for x and y; this equation then becomes an identity in c. If we 
differentiate with respect to c and remember that $f_c = 0$, we at once obtain 
the relation 


$f_x \frac{dx}{dc} + f_y \frac{dy}{dc} = 0,$


which therefore holds for all points of E. If the two expressions $f_x^2 + f_y^2$ 
and $(dx/dc)^2 + (dy/dc)^2$ both differ from 0 at a point of E, so that at 
that point both the curve E and the curve of the family have 
well-defined tangents, this equation states that the envelope and the curve 
of the family touch one another. With these additional assumptions 
our rule is a sufficient condition for the envelope as well as a necessary 
one. If, however, $f_x$ and $f_y$ both vanish, the curve of the family may 
have a singular point (cf. p. 236), and we can draw no conclusions 
about the contact of the curves. 

Thus, after we have found the discriminant curve, it is still neces- 
sary to make a further investigation in each case, in order to discover 
whether it is really an envelope or to what extent it fails to be one. 

In conclusion, we state the condition for the discriminant curve of a 
family of curves given in parametric form 


$x = \phi(t, c), \quad y = \psi(t, c),$


with the curve parameter t. This is 


$\phi_t \psi_c - \phi_c \psi_t = 0.$


We can readily obtain this condition by passing from the parametric 
representation of the family to the original expression by elimination 
of t. 
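
As a brief worked illustration of this condition (the parametrization is an assumption chosen here), take the unit circles with centers on the x-axis, written as $x = c + \cos t$, $y = \sin t$:

```latex
\begin{aligned}
\phi(t,c) &= c + \cos t, & \psi(t,c) &= \sin t,\\
\phi_t &= -\sin t, \quad \phi_c = 1, & \psi_t &= \cos t, \quad \psi_c = 0,\\
\phi_t\psi_c - \phi_c\psi_t &= -\cos t = 0
  &&\Longrightarrow\; t = \tfrac{\pi}{2} \ \text{or}\ t = \tfrac{3\pi}{2},\\
(x, y) &= (c, 1) \ \text{or}\ (c, -1)
  &&\Longrightarrow\; y = 1 \ \text{or}\ y = -1 .
\end{aligned}
```

The discriminant curve therefore consists of the lines y = 1 and y = −1, in agreement with the envelope of this family described at the beginning of this subsection.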


Exercises 3.5b 


1. Do the normals to a smooth plane curve always have an envelope? 
2. The straight lines 

$y = cx + \psi(c)$

satisfy the differential equation 


$y = xy' + \psi(y')$

(Clairaut equation). Obtain a nonparametric equation for the envelope 
of the family and verify that it, too, must satisfy the differential equation. 


c. Examples 


1. $(x - c)^2 + y^2 = 1$. As we remarked on p. 292, this equation 
represents the family of circles of unit radius whose centers lie on the 
x-axis (Fig. 3.17). Geometrically, we see at once that the envelope must 
consist of the two lines y = 1 and y = −1. We can verify this by means 
of our rule; for the two equations $(x - c)^2 + y^2 = 1$ and $-2(x - c) = 0$ 
immediately give us the envelope in the form $y^2 = 1$. 

2. The family of circles of unit radius passing through the origin, 
whose centers, therefore, must lie on the circle of unit radius about 
the origin, is given by the equation 


$(x - \cos c)^2 + (y - \sin c)^2 = 1$
or 
$x^2 + y^2 - 2x\cos c - 2y\sin c = 0.$


The derivative with respect to c equated to 0 gives $x\sin c - y\cos c = 0$. 
These two equations are satisfied by the values x = 0 and y = 0. If, 
however, $x^2 + y^2 \ne 0$, it readily follows from our equations that 
$\sin c = y/2$, $\cos c = x/2$, so that on eliminating c we obtain $x^2 + y^2 = 4$. 
Thus, for the envelope our rule gives us the circle of radius 2 about the 
origin, as is anticipated by geometrical intuition; but it also gives us 
the isolated point x = 0, y = 0. 

3. The family of parabolas $(x - c)^2 - 2y = 0$ (cf. Fig. 3.18) also has 
an envelope, which both by intuition and by our rule is found to be the 
x-axis. 



Figure 3.18 Family of parabolas with envelope. 




4. We consider the family of circles $(x - 2c)^2 + y^2 - c^2 = 0$ (cf. 
Fig. 3.19). Differentiation with respect to c gives 2x − 3c = 0, and by 
substitution we find that the equation of the envelope is 


$y^2 = \frac{x^2}{3};$


that is, the envelope consists of the two lines 


$y = \frac{1}{\sqrt{3}}\,x \quad \text{and} \quad y = -\frac{1}{\sqrt{3}}\,x.$

The origin is an exception in that contact does not occur there. 
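
The result of Example 4 is easy to confirm numerically. In the sketch below (function names and sample parameter values are assumptions for illustration), the point of the line $y = \pm x/\sqrt{3}$ belonging to the parameter c is taken from 2x − 3c = 0, and the residuals of f = 0, $f_c = 0$, and of the tangency condition are evaluated.

```python
import math

def f(x, y, c):
    """Family of circles of Example 4: (x - 2c)^2 + y^2 - c^2."""
    return (x - 2 * c) ** 2 + y ** 2 - c ** 2

def f_c(x, y, c):
    """Partial derivative of f with respect to the parameter c."""
    return -4 * (x - 2 * c) - 2 * c

def contact_point(c, sign):
    """Point on the line y = sign * x / sqrt(3) with 2x - 3c = 0."""
    x = 1.5 * c
    return x, sign * x / math.sqrt(3)

residuals = []
for c in (-2.0, 0.5, 1.0, 3.0):
    for sign in (1.0, -1.0):
        x, y = contact_point(c, sign)
        grad = (2 * (x - 2 * c), 2 * y)          # normal to the circle (f_x, f_y)
        line_dir = (1.0, sign / math.sqrt(3))    # direction of the envelope line
        tangency = grad[0] * line_dir[0] + grad[1] * line_dir[1]
        residuals.extend([f(x, y, c), f_c(x, y, c), tangency])

max_residual = max(abs(r) for r in residuals)
print(max_residual)
```

All residuals vanish up to rounding for c ≠ 0; for c = 0 the circle degenerates to the origin, which is the exceptional point mentioned above.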


Figure 3.19 The family $(x - 2c)^2 + y^2 - c^2 = 0$. 


5. We next consider the family of straight lines on which unit 
length is cut out by the x- and y-axes. If $\alpha = c$ is the angle indicated 
in Fig. 3.20, the lines are given by the equation 


$\frac{x}{\cos\alpha} + \frac{y}{\sin\alpha} = 1.$


The condition for the envelope is 


$\frac{\sin\alpha}{\cos^2\alpha}\,x - \frac{\cos\alpha}{\sin^2\alpha}\,y = 0,$


which, in conjunction with the equation of the lines, gives the 
envelope in parametric form, 


$x = \cos^3\alpha, \quad y = \sin^3\alpha.$


Figure 3.20 Arc of the astroid as envelope of straight lines. 


Eliminating the parameter, we obtain the equation 


$x^{2/3} + y^{2/3} = 1.$


This curve is called the astroid (cf. Volume I, Chapter 4, Exercise 1, 
p. 435). It consists (Figs. 3.21 and 3.22) of four symmetrical branches 
meeting in four cusps. 
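
The parametric form of the astroid can be checked against the two envelope conditions of Example 5. The following sketch (sample angles and names are illustrative assumptions) evaluates, for several values of $\alpha$ in the first quadrant, the residuals of the line equation, of its derivative with respect to $\alpha$, and of the equation $x^{2/3} + y^{2/3} = 1$.

```python
import math

def astroid_point(alpha):
    """Envelope point x = cos^3(alpha), y = sin^3(alpha)."""
    return math.cos(alpha) ** 3, math.sin(alpha) ** 3

def line_residual(x, y, alpha):
    """x / cos(alpha) + y / sin(alpha) - 1 for the line with parameter alpha."""
    return x / math.cos(alpha) + y / math.sin(alpha) - 1.0

def derivative_residual(x, y, alpha):
    """Derivative of the line equation with respect to alpha."""
    return (math.sin(alpha) / math.cos(alpha) ** 2 * x
            - math.cos(alpha) / math.sin(alpha) ** 2 * y)

residuals = []
for alpha in (0.2, 0.6, 1.0, 1.4):
    x, y = astroid_point(alpha)
    residuals.append(line_residual(x, y, alpha))
    residuals.append(derivative_residual(x, y, alpha))
    residuals.append(x ** (2.0 / 3.0) + y ** (2.0 / 3.0) - 1.0)

max_residual = max(abs(r) for r in residuals)
print(max_residual)
```

The restriction to 0 < α < π/2 keeps x and y positive, so that the fractional powers are real; the other three branches follow by symmetry.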


Figure 3.21 Astroid. 

Figure 3.22 Astroid as envelope of ellipses. 


6. The astroid $x^{2/3} + y^{2/3} = 1$ also appears as the envelope of the 
family of ellipses 

$\frac{x^2}{c^2} + \frac{y^2}{(1 - c)^2} = 1$

whose semiaxes c and (1 − c) have the constant sum 1 (Fig. 3.22). 

7. The family of curves $(x - c)^2 - y^3 = 0$ shows that in certain 
circumstances our process may fail to give an envelope. Here the rule 
gives the x-axis. But, as Fig. 3.23 shows, this is not an envelope; 
it is the locus of the cusps of the curves of the family. 


Figure 3.23 The family $(x - c)^2 - y^3 = 0$. 


8. For the family 

$(x - c)^3 - y^2 = 0,$

the discriminant curve is the x-axis (cf. Fig. 3.24). This is again the 
cusp-locus; but it touches each of the curves, and in this sense must 
be regarded as the envelope. 


Figure 3.24 The family $(x - c)^3 - y^2 = 0$. 


9. The family of strophoids 

$[x^2 + (y - c)^2](x - 2) + x = 0$

(cf. Fig. 3.25) has a discriminant curve consisting of the envelope plus 
the locus of the double points. The curves of the family are congruent 
to each other and arise from one another by translation parallel to 
the y-axis. By differentiation we obtain 


$f_c = -2(y - c)(x - 2) = 0,$


so that we must have either x = 2 or y = c. The line x = 2 does not 
enter into the matter, however, for no finite value of y corresponds to 
x = 2. We therefore have y = c, so that the discriminant curve is 


$x^2(x - 2) + x = 0.$


This curve consists of the straight lines x = 0 and x = 1. As we see in 
Fig. 3.25, only x = 0 is the envelope; the line x = 1 passes through the 
double points of the curves. 


Figure 3.25 Family of strophoids. 


10. The envelope need not be the locus of the points of intersection 
of neighbouring curves; that is shown by the family of identical 
parallel cubical parabolas $y - (x - c)^3 = 0$. No two of these curves 
intersect each other. The rule gives the equation $f_c = 3(x - c)^2 = 0$, so that 
the x-axis y = 0 is the discriminant curve. Since all the curves of the 
family are touched by it, it is also the envelope (Fig. 3.26). 


Figure 3.26 Family of cubical parabolas. 


11. The notion of the envelope enables us to give a new definition 
for the evolute of a curve C (cf. Volume I, pp. 359, 424 ff.). Let C be given 
by 

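$x = \phi(t), \quad y = \psi(t).$


We define the evolute E of C as the envelope of the normals of C. Since 
the normals of C are given by 


$\{x - \phi(t)\}\,\phi'(t) + \{y - \psi(t)\}\,\psi'(t) = 0,$


the envelope is found by differentiating this equation with respect to 
t: 


$0 = \{x - \phi(t)\}\,\phi''(t) + \{y - \psi(t)\}\,\psi''(t) - \phi'^2(t) - \psi'^2(t).$


From this equation and the preceding one, we obtain the parametric 
representation of the envelope, 


$x = \phi(t) - \psi'(t)\,\frac{\phi'^2 + \psi'^2}{\phi'\psi'' - \phi''\psi'} = \phi - \frac{\rho\,\psi'}{\sqrt{\phi'^2 + \psi'^2}},$

$y = \psi(t) + \phi'(t)\,\frac{\phi'^2 + \psi'^2}{\phi'\psi'' - \phi''\psi'} = \psi + \frac{\rho\,\phi'}{\sqrt{\phi'^2 + \psi'^2}},$


where 


$\rho = \frac{(\phi'^2 + \psi'^2)^{3/2}}{\phi'\psi'' - \phi''\psi'}$

denotes the radius of curvature (cf. Volume I, p. 358). These equations 
are identical with those given in Volume I (p. 359) for the evolute. 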
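
The two linear equations for the evolute can be solved mechanically. The sketch below applies the resulting formulas to the parabola $x = t$, $y = t^2/2$, an example chosen here for illustration, and confirms that the computed point satisfies both the equation of the normal and its derivative with respect to t. The function name and argument layout are assumptions of this sketch.

```python
def evolute_point(point, d1, d2):
    """Center of curvature from (phi, psi), (phi', psi'), (phi'', psi'')."""
    (p, q), (p1, q1), (p2, q2) = point, d1, d2
    lam = (p1 * p1 + q1 * q1) / (p1 * q2 - p2 * q1)
    return p - lam * q1, q + lam * p1

t = 0.8
point = (t, t * t / 2)   # phi(t) = t, psi(t) = t^2 / 2
d1 = (1.0, t)            # first derivatives
d2 = (0.0, 1.0)          # second derivatives

ex, ey = evolute_point(point, d1, d2)

# Residual of the normal equation {x - phi} phi' + {y - psi} psi' = 0
normal_residual = (ex - point[0]) * d1[0] + (ey - point[1]) * d1[1]
# Residual of its derivative with respect to t
derivative_residual = ((ex - point[0]) * d2[0] + (ey - point[1]) * d2[1]
                       - d1[0] ** 2 - d1[1] ** 2)
print(ex, ey, normal_residual, derivative_residual)
```

For this parabola the formulas reduce to $x = -t^3$, $y = 1 + \tfrac{3}{2}t^2$, the familiar semicubical parabola.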
12. Let a curve C be given by $x = \phi(t)$, $y = \psi(t)$. We form the 
envelope E of the circles having their centers on C and passing through 
the origin O. Since the circles are given by 


$x^2 + y^2 - 2x\phi(t) - 2y\psi(t) = 0,$


the equation of E is 

$x\phi'(t) + y\psi'(t) = 0.$


Hence, if P is the point $(\phi(t), \psi(t))$ and Q(x, y) is the corresponding point 
of E, then OQ is perpendicular to the tangent to C at P. Since by 
definition PQ = PO, PO and PQ make equal angles with the tangent 
to C at P. 

If we imagine O to be a luminous point and C a reflecting curve, 
then QP is the reflected ray corresponding to OP. The envelope of the 
reflected rays is called the caustic of C with respect to O. The caustic 
is the evolute of E: the reflected ray PQ is normal to E, since a circle 
with center P touches E at Q, and the envelope of the normals of E 
is its evolute, as we saw in the preceding example. 

For example, let C be a circle passing through O. Then E is the path 
described by the point O’ of a circle C’ congruent to C that rolls on C 
and starts with O and O’ coincident, for during the motion O and O’ 
always occupy symmetrical positions with respect to the common 
tangent of the two circles. Thus, E will be a special epicycloid, in fact, 
a cardioid (cf. Volume I, p. 329 ff.). As the evolute of an epicycloid is a 
similar epicycloid (cf. Volume I, p. 489), the caustic of C with respect to 
O is in this case a cardioid. 


Exercises 3.5c 


1. A projectile fired from the origin at initial angle of inclination $\alpha$ and 
fixed initial speed v travels in a parabolic trajectory given by the 
equations 


$x = (v\cos\alpha)\,t,$
$y = (v\sin\alpha)\,t - \tfrac{1}{2}\,g t^2,$


where g is the constant acceleration of gravity. 
(a) Find the envelope of the family of trajectories with parameter $\alpha$. 
(b) Show that no point above the envelope can be hit by the projectile. 
(c) Show that every point below the envelope can be hit in two ways, 
that is, that such a point lies on two trajectories. 
2. Obtain the envelopes of the following families of curves: 


(a) $y = cx + 1/c$

(b) $y^2 = c(x - c)$

(c) $cx^2 + y^2/c = 1$

(d) $(x - c)^2 + y^2 = a^2c^2/(1 + a^2)$, a = constant. 


3. Let C be an arbitrary curve in the plane, and consider the circles of 
radius p whose centers lie on C. Prove that the envelope of these circles 
is formed by the two curves parallel to C at the distance p (cf. the 
definition of parallel curves, Volume I, p. 291). 


4. A family of straight lines in space may be given as the intersection of 
two planes depending on a parameter t: 


$a(t)x + b(t)y + c(t)z = 1,$
$d(t)x + e(t)y + f(t)z = 1.$


Prove that if these straight lines are tangents to some curve (i.e., 
possess an envelope), then 


$\begin{vmatrix} a - d & b - e & c - f \\ a' & b' & c' \\ d' & e' & f' \end{vmatrix} = 0.$


5. If a plane curve C is given by x = f(t), y = g(t), its polar reciprocal 
C' is defined as the envelope of the family of straight lines 


$\xi f(t) + \eta g(t) = 1,$

where $(\xi, \eta)$ are running coordinates. 
(a) Prove that C is also the polar reciprocal of C'. 
(b) Find the polar reciprocal of the circle $(x - a)^2 + (y - b)^2 = 1$. 
(c) Find the polar reciprocal of the ellipse $x^2/a^2 + y^2/b^2 = 1$. 


6. A circle of radius a rolls on a fixed straight line, carrying a tangent 
fixed relatively to the circle. Taking axes at the point of contact where 
the moving tangent coincides with the fixed line, show that the en- 
velope of the tangent is given by 


$x = a(\theta + \cos\theta\sin\theta - \sin\theta),$
$y = a(\cos^2\theta - \cos\theta).$
7. Find the envelope of a variable circle in a plane which passes through 
a fixed point O, and whose center describes a given conic with center 
O. 

8. (a) If $\Gamma$ is a plane curve and O a point in its plane, the locus $\Gamma'$ of the 
orthogonal projections of O on a variable tangent of $\Gamma$ is called the 
pedal curve of $\Gamma$ with respect to the point O. Prove that if the point 
M describes the curve $\Gamma$, the pedal curve $\Gamma'$ is the envelope of the 
variable circle with the radius vector OM as diameter. 


(b) What is the envelope like if $\Gamma$ is a circle and O a point on its 
circumference? 


9. MM’ is a variable chord of an ellipse parallel to the minor axis. Find 
the envelope of the variable circle with MM’ as diameter. 


d. Envelopes of Families of Surfaces 


The remarks made about the envelopes of families of curves apply 
with but little alteration to families of surfaces also. Given a 
one-parameter family of surfaces f(x, y, z, c) = 0 defined for an interval of 
parameter values c, we shall say that a surface E is the envelope of the 
family if it touches each surface of the family along a whole curve and 
if, further, these curves of contact form a one-parameter family of 
curves on E that completely cover E. 

An example is given by the family of all spheres of unit radius with 
centers on the z-axis. We see intuitively that the envelope is the 
cylinder $x^2 + y^2 - 1 = 0$ with unit radius and axis along the z-axis; the 
family of curves of contact is simply the family of circles parallel to 
the x, y-plane, with unit radius and center on the z-axis.¹ 

As on p. 292, if we assume that the envelope does exist we can find 
it by the following heuristic method: We first consider surfaces 
f(x, y, z,c) = O and f(x,y,z, c + h) = Ocorresponding to two different 
parameter values c and c + h. These two equations determine the 
curve of intersection of the two surfaces (we expressly assume that 
such a curve of intersection exists). As a consequence of the two equa- 
tions above, this curve also satisfies the third equation 


f(x,y, z,c + h) — f(x, y, z,c) _ 0 
h — e 


If we let h tend to zero, the curve of intersection will approach a defi- 
nite limiting position, and this limit curve is determined by the two 
equations 


(54)    $f(x, y, z, c) = 0, \qquad f_c(x, y, z, c) = 0.$


This curve is often referred to in a nonrigorous intuitive way as the in- 
tersection of neighboring surfaces of the family. It is a function of the 
parameter c, so that the curves of intersection for all the different 
values of c form a one-parameter family of curves in space. If we elim- 
inate the quantity c from the two equations above, we obtain an 
equation that is called the discriminant. As on p. 293, we can show that 
the envelope must satisfy this discriminant equation. 

Just as in the case of plane curves, we may readily convince 
ourselves that a plane touching the discriminant surface also touches the 
corresponding surface of the family, provided that $f_x^2 + f_y^2 + f_z^2 \neq 0$. 
Hence, the discriminant surface again gives the envelopes of the 
family and the loci of the singularities of the surfaces of the family. 

As a first example, we consider the family of spheres 

$x^2 + y^2 + (z - c)^2 - 1 = 0$

mentioned above. To find the envelope we have the additional 
equation 

$-2(z - c) = 0.$

¹The envelopes of spheres of constant radius whose centers lie on a given curve are 
called tube-surfaces. 


For fixed values of c these two equations obviously represent the circle 
of unit radius parallel to the x, y-plane at the height z = c. If we 
eliminate the parameter c between the two equations, we obtain the 
equation of the envelope in the form $x^2 + y^2 - 1 = 0$, which is the 
equation of the right circular cylinder with unit radius and the z-axis 
as axis. 

For families of surfaces it is also possible to find envelopes of two- 
parameter families $f(x, y, z, c_1, c_2) = 0$. (For families of curves, 
however, the concept of envelope has a meaning only for one-parameter 
families.) For example, we consider the family of all spheres with unit 
radius and center on the x, y-plane, represented by the equation 

$(x - c_1)^2 + (y - c_2)^2 + z^2 - 1 = 0.$


Intuition tells us at once that the two planes z = 1 and z = -1 touch 
a surface of the family at every point. In general, we shall say that a 
surface E is the envelope of a two-parameter family of surfaces if at 
every point P of E the surface E touches a surface of the family in such 
a way that as P ranges over E, the parameter values $c_1, c_2$ corresponding 
to the surface touching E at P range over a region of the $c_1, c_2$-plane, 
and in addition different points $(c_1, c_2)$ correspond to different 
points P of E. A surface of the family then touches the envelope at a 
point and not, as before, along a whole curve. 

With assumptions similar to those made in the case of plane curves, 
we find that the point of contact of a surface of the family with the en- 
velope, if it exists, must satisfy the equations 


$f(x, y, z, c_1, c_2) = 0, \qquad f_{c_1}(x, y, z, c_1, c_2) = 0, \qquad f_{c_2}(x, y, z, c_1, c_2) = 0.$


From these three equations we determine the point of contact of a 
given surface of the family by assigning the corresponding values to 
the parameters. Conversely, if we eliminate the parameters $c_1$ and $c_2$, 
we obtain an equation that the envelope must satisfy. 

For example, the family of spheres with unit radius and center on 
the x, y-plane is given by the equation 


$f(x, y, z, c_1, c_2) = (x - c_1)^2 + (y - c_2)^2 + z^2 - 1 = 0$

with the two parameters $c_1$ and $c_2$. The rule for forming the envelope 
gives the two equations 

$f_{c_1} = -2(x - c_1) = 0$ and $f_{c_2} = -2(y - c_2) = 0.$

Thus, for the discriminant equation, we have $z^2 - 1 = 0$, and in fact, 
the two planes z = 1 and z = -1 are envelopes, as we have already 
seen intuitively. 
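
The rule f = 0, f_c1 = 0, f_c2 = 0 for this family can be checked numerically. The following is only a minimal sketch in Python; the function names (`f`, `f_c1`, `f_c2`, `grad_f`, `contact_points`, `touches_planes`) are illustrative choices and do not come from the text. It takes the contact points x = c1, y = c2, z = ±1 obtained by solving the three equations by hand and confirms that the sphere's normal there is vertical, so that the sphere touches the planes z = 1 and z = -1.

```python
def f(x, y, z, c1, c2):
    """Unit sphere with center (c1, c2, 0): the two-parameter family of the text."""
    return (x - c1) ** 2 + (y - c2) ** 2 + z ** 2 - 1.0


def f_c1(x, y, z, c1, c2):
    """Partial derivative of f with respect to the parameter c1."""
    return -2.0 * (x - c1)


def f_c2(x, y, z, c1, c2):
    """Partial derivative of f with respect to the parameter c2."""
    return -2.0 * (y - c2)


def grad_f(x, y, z, c1, c2):
    """Gradient of f in (x, y, z); it is normal to the sphere."""
    return (2.0 * (x - c1), 2.0 * (y - c2), 2.0 * z)


def contact_points(c1, c2):
    """Solutions of f = f_c1 = f_c2 = 0, found by hand: x = c1, y = c2, z = +-1."""
    return [(c1, c2, 1.0), (c1, c2, -1.0)]


def touches_planes(c1, c2, tol=1e-12):
    """True if both contact points satisfy the three equations and have a vertical normal."""
    for (x, y, z) in contact_points(c1, c2):
        if abs(f(x, y, z, c1, c2)) > tol:
            return False
        if abs(f_c1(x, y, z, c1, c2)) > tol or abs(f_c2(x, y, z, c1, c2)) > tol:
            return False
        nx, ny, nz = grad_f(x, y, z, c1, c2)
        # A vertical normal means the tangent plane is horizontal, i.e. z = +-1.
        if abs(nx) > tol or abs(ny) > tol or abs(nz) < 1.0:
            return False
    return True
```

Each sphere meets the envelope in a single point on each plane, in agreement with the remark that contact now takes place at a point and not along a curve.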


Exercises 3.5d 


1. What is the envelope of the family of ellipsoids of constant volume 
(i.e., fixed product of the semiaxes) with common center at O and axes 
parallel to the coordinate axes? 


2. What is the envelope of the family of planes ax + by + cz = 1, where 
$\sqrt{a^2 + b^2 + c^2} = 1$? 

3. (a) Find the envelope of the two-parameter family of planes for which 
OP + OQ + OR = constant = 1, 
where P, Q, R denote the points of intersection of the planes with 
the coordinate axes and O the origin. 
(b) Find the envelope of the planes for which 

$OP^2 + OQ^2 + OR^2 = 1.$

4. A family of planes is given by 

$x \cos t + y \sin t + z = t,$

where t is a parameter. 

(a) Find the equation of the envelope for the planes in cylindrical 
coordinates (r, z, θ). 

(b) Prove that the envelope consists of the tangents to a certain curve. 

5. Let z = u(x, y) be the equation of a tube-surface, that is, the envelope 
of a family of spheres of unit radius with their centers on some curve 
y = f(x) in the x, y-plane. Prove that $u^2(u_x^2 + u_y^2 + 1) = 1$. 

6. Find the envelope of the family of spheres that touch the three spheres 

$S_1: \left(x - \tfrac{3}{2}\right)^2 + y^2 + z^2 = \tfrac{9}{4},$
$S_2: x^2 + \left(y - \tfrac{3}{2}\right)^2 + z^2 = \tfrac{9}{4},$
$S_3: x^2 + y^2 + \left(z - \tfrac{3}{2}\right)^2 = \tfrac{9}{4}.$

7. Let Γ be a plane curve and Γ′ its pedal curve as described in Exercise 8, 
p. 303. 
(a) Let M be a point describing the curve Γ. What is the envelope of the 
variable sphere with the radius vector OM as diameter? 

(b) What is the envelope of the variable spheres if Γ is a circle and O 
a point on its circumference? 


8. Show that the surface xyz = constant is the envelope of the family of 
planes that form, with the coordinate planes, a tetrahedron of constant 
volume (i.e., fixed product of the intercepts). 


9. A plane moves so as to touch the parabolas z = 0, $y^2 = 4x$ and y = 0, 
$z^2 = 4x$. Show that its envelope consists of two parabolic cylinders. 


3.6 Alternating Differential Forms 


a. Definition of Alternating Differential Forms 


In Chapter 1 (p. 84) we considered the general linear differential 
form 


(55a) L = A(x, y, z) dx + B(x, y, z) dy + C(x, y, z) dz 


in three independent variables. Along any curve Γ with parameter 
representation $x = \phi(t)$, $y = \psi(t)$, $z = \chi(t)$ the form L determines values 

(55b)    $\frac{L}{dt} = A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt} = A\dot\phi + B\dot\psi + C\dot\chi,$

which depend on the special parametric representation of Γ. If Γ is 
referred to a different parameter τ, we obtain 

(55c)    $\frac{L}{d\tau} = A\frac{dx}{d\tau} + B\frac{dy}{d\tau} + C\frac{dz}{d\tau} = \left(A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt}\right)\frac{dt}{d\tau} = \frac{L}{dt}\,\frac{dt}{d\tau}.$

However, the integral 

$\int L = \int \frac{L}{dt}\,dt = \int \left(A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt}\right) dt$

depends only on the curve Γ (and its orientation) and not on the 
particular parametric representation. 
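
The independence of the integral from the parametrization can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the original text: the form L = -y dx + x dy + z dz, the quarter turn of a helix, the substitution t = τ², and the helper name `line_integral` are all chosen here for the example. The integral of (L/dt) dt is approximated by the midpoint rule for both parametrizations.

```python
import math


def line_integral(A, B, C, curve, dcurve, t0, t1, n=20000):
    """Midpoint-rule approximation of the integral of (L/dt) dt over [t0, t1]."""
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        x, y, z = curve(t)
        dx, dy, dz = dcurve(t)
        # L/dt = A dx/dt + B dy/dt + C dz/dt, as in (55b)
        total += (A(x, y, z) * dx + B(x, y, z) * dy + C(x, y, z) * dz) * h
    return total


# Coefficients of the illustrative form L = -y dx + x dy + z dz.
A = lambda x, y, z: -y
B = lambda x, y, z: x
C = lambda x, y, z: z

# First parametrization: x = cos t, y = sin t, z = t for 0 <= t <= pi/2.
curve_t = lambda t: (math.cos(t), math.sin(t), t)
dcurve_t = lambda t: (-math.sin(t), math.cos(t), 1.0)

# Second parametrization of the same curve: t = tau**2, 0 <= tau <= sqrt(pi/2).
curve_tau = lambda s: (math.cos(s * s), math.sin(s * s), s * s)
dcurve_tau = lambda s: (-2 * s * math.sin(s * s), 2 * s * math.cos(s * s), 2 * s)

I_t = line_integral(A, B, C, curve_t, dcurve_t, 0.0, math.pi / 2)
I_tau = line_integral(A, B, C, curve_tau, dcurve_tau, 0.0, math.sqrt(math.pi / 2))
```

Here L/dt = 1 + t, so the exact value is π/2 + π²/8; the integrands L/dt and L/dτ differ by the factor dt/dτ = 2τ, as (55c) requires, while the two integrals agree up to the quadrature error.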

Similarly, we can consider a differential form ω which is quadratic 
in dx, dy, and dz, namely, a linear combination ω of the symbols 
dx dx, dx dy, dx dz, dy dx, dy dy, dy dz, dz dx, dz dy, dz dz with 
coefficients that are functions of x, y, z. Upon any surface S in space with 
parametric representation $x = \phi(s, t)$, $y = \psi(s, t)$, $z = \chi(s, t)$, the form 
ω defines values ω/ds dt if we agree that the quotients 

$\frac{dx\,dx}{ds\,dt}, \quad \frac{dx\,dy}{ds\,dt}, \quad \frac{dx\,dz}{ds\,dt}, \ \ldots$

are to stand respectively for the Jacobians 

$\frac{d(x, x)}{d(s, t)}, \quad \frac{d(x, y)}{d(s, t)}, \quad \frac{d(x, z)}{d(s, t)}, \ \ldots$


We do not distinguish between two differential forms that yield the 
same values ω/ds dt at each point of the surface. In view of the 
alternating character of determinants, namely, that 

$\frac{d(x, x)}{d(s, t)} = 0, \qquad \frac{d(x, y)}{d(s, t)} = -\frac{d(y, x)}{d(s, t)}, \ \ldots,$

we see that the terms of ω with dx dx, dy dy, dz dz make no 
contributions and that dy dx, dz dy, dx dz can be replaced respectively by 
-dx dy, -dy dz, -dz dx. Thus the most general quadratic differential 
form in dx, dy, dz can be written as 

(56a)    $\omega = a(x, y, z)\,dy\,dz + b(x, y, z)\,dz\,dx + c(x, y, z)\,dx\,dy.$


The values that ω associates with the points of a surface S referred to 
parameters s, t are 

(56b)    $\frac{\omega}{ds\,dt} = a(x, y, z)\frac{d(y, z)}{d(s, t)} + b(x, y, z)\frac{d(z, x)}{d(s, t)} + c(x, y, z)\frac{d(x, y)}{d(s, t)}.$

Giving S different parameters s′, t′, we obtain from the multiplication 
law for Jacobians (see p. 258) 

(56c)    $\frac{\omega}{ds'\,dt'} = a\frac{d(y, z)}{d(s', t')} + b\frac{d(z, x)}{d(s', t')} + c\frac{d(x, y)}{d(s', t')} = \frac{\omega}{ds\,dt}\,\frac{d(s, t)}{d(s', t')}.$


Later (p. 593), we shall also define the double integral 

$\iint_S \omega$

and see that it does not depend on the particular parameter 
representation of the surface S. 

¹This convention characterizes alternating differential forms. In other contexts, 
nonalternating quadratic differential forms are encountered as well, such as the one 
giving the square of the line element in space or on a surface (see p. 283): 

$ds^2 = dx^2 + dy^2 + dz^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2.$

In a similar way, we can consider a differential form ω that is cubic 
in dx, dy, dz. Such a form assigns values ω/dr ds dt corresponding to 
any parametric representation 

$x = \phi(r, s, t), \qquad y = \psi(r, s, t), \qquad z = \chi(r, s, t),$

where again we interpret the quotients 

$\frac{dx\,dx\,dx}{dr\,ds\,dt}, \quad \frac{dx\,dy\,dz}{dr\,ds\,dt}, \ \ldots$

as the Jacobians 

$\frac{d(x, x, x)}{d(r, s, t)}, \quad \frac{d(x, y, z)}{d(r, s, t)}, \ \ldots$

Since the Jacobians vanish when two of the dependent variables are 
identical and change signs when two of the dependent variables are 
interchanged, the cubic differential forms in the three independent 
variables x, y, z are all of the type 

(56d)    $\omega = a(x, y, z)\,dx\,dy\,dz.$

Whenever x, y, z are represented as functions of r, s, t, we obtain from 
ω the value 

(56e)    $\frac{\omega}{dr\,ds\,dt} = a\,\frac{d(x, y, z)}{d(r, s, t)}.$

Proceeding in the same manner we could define "alternating" 
differential forms in dx, dy, dz of degrees 4, 5, . . .. But all of these are 
identically 0, since any Jacobians of orders 4, 5, . . . that we could 
form would have two of the dependent variables identical, and, hence, 
would vanish.¹ 


¹Higher-order forms have, however, a nontrivial meaning in spaces of higher 
dimensions. In four-dimensional x, y, z, u-space the most general alternating 
differential forms of order 1, 2, 3, 4 can be written as 

(56f)    A dx + B dy + C dz + D du 
(56g)    A dx dy + B dy dz + C dz du + D du dx + E dx dz + F dy du 
(56h)    A dy dz du + B dz du dx + C du dx dy + D dx dy dz 
(56i)    A dx dy dz du, 

respectively, with coefficients A, B, . . ., which are functions of x, y, z, u. Forms of 
order higher than 4 vanish. 


Exercises 3.6a 


1. Find ω/du dv for each of the following: 
(a) ω = x dy dz + y dz dx + z dx dy, 
x = cos u sin v, y = sin u sin v, z = cos v 
(b) ω = (y - z) dy dz + (z - x) dz dx + (x - y) dx dy, 
x = au + bv, y = bu + cv, z = cu + av 
(c) ω = dy dz + dz dx + dx dy, 
$x = u^2 + v^2$, $y = 2uv$, $z = u^2 - v^2$. 


b. Sums and Products of Differential Forms 


Two differential forms of the same order (i.e., either both linear, 
both quadratic, or both cubic) can be added trivially by adding 
corresponding coefficients. Thus, for 

$\omega_1 = a_1\,dy\,dz + b_1\,dz\,dx + c_1\,dx\,dy,$
$\omega_2 = a_2\,dy\,dz + b_2\,dz\,dx + c_2\,dx\,dy,$

we define 

(57a)    $\omega_1 + \omega_2 = (a_1 + a_2)\,dy\,dz + (b_1 + b_2)\,dz\,dx + (c_1 + c_2)\,dx\,dy.$

We can define the product $\omega_1\omega_2$ of any two differential forms $\omega_1$ 
and $\omega_2$ of the same or of different orders by just substituting for $\omega_1$ 
and $\omega_2$ their expressions in terms of dx, dy, dz and applying the 
distributive law of multiplication, taking care, however, to preserve the 
original order of the differentials in each term.¹ Thus, the product 
of the two linear forms 

$\omega_1 = A_1\,dx + B_1\,dy + C_1\,dz$ and $\omega_2 = A_2\,dx + B_2\,dy + C_2\,dz$

would be the quadratic form 

(57b)    $\omega_1\omega_2 = (A_1\,dx + B_1\,dy + C_1\,dz)(A_2\,dx + B_2\,dy + C_2\,dz)$
    $= A_1A_2\,dx\,dx + A_1B_2\,dx\,dy + A_1C_2\,dx\,dz + B_1A_2\,dy\,dx$
    $\quad + B_1B_2\,dy\,dy + B_1C_2\,dy\,dz + C_1A_2\,dz\,dx$
    $\quad + C_1B_2\,dz\,dy + C_1C_2\,dz\,dz$
    $= (B_1C_2 - C_1B_2)\,dy\,dz + (C_1A_2 - A_1C_2)\,dz\,dx$
    $\quad + (A_1B_2 - B_1A_2)\,dx\,dy.$

¹The product formed in this way is sometimes denoted by the symbol $\omega_1 \wedge \omega_2$. 


If we describe the individual forms $\omega_1$ and $\omega_2$ by the "coefficient 
vectors" $R_1 = (A_1, B_1, C_1)$ and $R_2 = (A_2, B_2, C_2)$, then the coefficients of 
the product $\omega_1\omega_2$ are just the components of the vector product $R_1 \times R_2$ 
(see p. 181). Clearly, the product of the forms is not commutative. 
Here, for example, $\omega_1\omega_2 = -\omega_2\omega_1$. 
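
The product rule (distribute, keep the order of the differentials, then apply the alternating law) is mechanical enough to sketch in a few lines of Python. The representation is an assumption made for illustration only: a form evaluated at one point is stored as a dictionary whose keys are increasing tuples of indices (0 for dx, 1 for dy, 2 for dz) and whose values are the numerical coefficients; the names `sort_sign`, `product`, `one_form`, and `two_form` are hypothetical helpers.

```python
def sort_sign(indices):
    """Sort a tuple of differential indices; return (sign, sorted tuple).

    The sign is 0 if an index repeats (dx dx = 0), otherwise +1 or -1
    according to the parity of the interchanges needed to sort.
    """
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


def product(w1, w2):
    """Product of two forms by the distributive law, preserving the order of factors."""
    out = {}
    for k1, c1 in w1.items():
        for k2, c2 in w2.items():
            sign, key = sort_sign(k1 + k2)
            if sign:
                out[key] = out.get(key, 0) + sign * c1 * c2
    return out


def one_form(A, B, C):
    """A dx + B dy + C dz."""
    return {(0,): A, (1,): B, (2,): C}


def two_form(a, b, c):
    """a dy dz + b dz dx + c dx dy; note dz dx = -dx dz, hence the sign on b."""
    return {(1, 2): a, (0, 2): -b, (0, 1): c}


w1 = one_form(1, 2, 3)
w2 = one_form(4, 5, 6)
p = product(w1, w2)
# Coefficients of dy dz, dz dx, dx dy, to be compared with the vector product.
coeffs = (p[(1, 2)], -p[(0, 2)], p[(0, 1)])
```

For the sample vectors (1, 2, 3) and (4, 5, 6) the vector product is (-3, 6, -3), and interchanging the factors reverses every sign. The same `product` function applied to a first-order and a second-order form returns the single coefficient Aa + Bb + Cc of dx dy dz.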

Multiplying the first-order form 

$\omega_1 = A\,dx + B\,dy + C\,dz$

with the second-order form 

$\omega_2 = a\,dy\,dz + b\,dz\,dx + c\,dx\,dy,$

we obtain similarly 

(57c)    $\omega_1\omega_2 = (A\,dx + B\,dy + C\,dz)(a\,dy\,dz + b\,dz\,dx + c\,dx\,dy)$
    $= Aa\,dx\,dy\,dz + Ab\,dx\,dz\,dx + Ac\,dx\,dx\,dy$
    $\quad + Ba\,dy\,dy\,dz + Bb\,dy\,dz\,dx + Bc\,dy\,dx\,dy$
    $\quad + Ca\,dz\,dy\,dz + Cb\,dz\,dz\,dx + Cc\,dz\,dx\,dy$
    $= (Aa + Bb + Cc)\,dx\,dy\,dz.$

We observe that in this case the coefficient of $\omega_1\omega_2$ is the scalar product 
of the coefficient vectors (A, B, C) and (a, b, c). Here, incidentally, 
$\omega_1\omega_2 = \omega_2\omega_1$. 

Forming the product of a first- and a third-order form, of two 
second-order forms, or of a second- and a third-order form yields forms 
of order higher than 3, which vanish. For the sake of completeness 
it is convenient to define differential forms of order 0 as the scalars 
a(x, y, z). The product of a form a of order 0 with a form ω of any order 
k = 0, 1, 2, 3 is then obtained by multiplying each of the coefficients 
of ω by the scalar a. 

It is easily seen from the definition that products of differential 
forms are associative. For three linear forms 

$L_i = A_i\,dx + B_i\,dy + C_i\,dz \qquad (i = 1, 2, 3),$

for example, as is to be proved in Exercise 5, 

(57d)    $L_1(L_2L_3) = \begin{vmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ A_3 & B_3 & C_3 \end{vmatrix}\,dx\,dy\,dz,$

and for $(L_1L_2)L_3$ we obtain the same evaluation. 

Of course, a greater variety of products of differential forms can be 
formed when the number of independent variables is greater than 3. 


Exercises 3.6b 


1. Evaluate the following products: 
(a) (x dx + y dy)(x dx - y dy) 
(b) $[(x^2 + y^2)\,dx + 2xy\,dy]\,[2xy\,dx + (x^2 - y^2)\,dy]$ 
(c) (a dx + b dy)(a dy dz + b dz dx + c dx dy) 
(d) (dx + dy + dz)(dy dz - dx dy). 

2. For any form ω of order 1 in x, y, z, show that $\omega^2 = 0$. 

3. For first-order forms $\omega_1, \omega_2$ in three variables, show that 

$(\omega_1 + \omega_2)(\omega_1 - \omega_2) = 2\omega_2\omega_1.$

4. Show for first-order forms in three variables that 

$(\omega_1 + \omega_2 + \omega_3 + \omega_4)(\omega_1 - \omega_2 + \omega_3 - \omega_4) = 2(\omega_2 + \omega_4)(\omega_1 + \omega_3).$

5. Derive (57d). 


c. Exterior Derivatives of Differential Forms 


For a differential form of order 0, that is, for a scalar a(x, y, z) 
we have by definition 

(58a)    $da = a_x\,dx + a_y\,dy + a_z\,dz.$

The coefficients of this differential form are just the components of the 
vector we denoted by grad a on p. 206. More generally, we define the 
exterior derivative dω of any differential form ω. For this purpose, we 
write out ω as a sum of terms where each term is a product of certain 
of the differentials dx, dy, dz preceded by a scalar factor and replace 
each of the scalar factors by its differential, formed in the ordinary 
sense. Thus, for a first-order form 

$L = A\,dx + B\,dy + C\,dz,$

we find for dL the second-order differential form 

(58b)    $dL = dA\,dx + dB\,dy + dC\,dz$
    $= (A_x\,dx + A_y\,dy + A_z\,dz)\,dx$
    $\quad + (B_x\,dx + B_y\,dy + B_z\,dz)\,dy + (C_x\,dx + C_y\,dy + C_z\,dz)\,dz$
    $= (C_y - B_z)\,dy\,dz + (A_z - C_x)\,dz\,dx + (B_x - A_y)\,dx\,dy.$

If we associate with L the vector R = (A, B, C), we have the remarkable 
fact that the coefficients of dL are just the components of the curl of R 
(see p. 209). 
For a second-order form 

$\omega = a\,dy\,dz + b\,dz\,dx + c\,dx\,dy$

the exterior derivative dω is the third-order form 

(58c)    $d\omega = da\,dy\,dz + db\,dz\,dx + dc\,dx\,dy$
    $= (a_x\,dx + a_y\,dy + a_z\,dz)\,dy\,dz$
    $\quad + (b_x\,dx + b_y\,dy + b_z\,dz)\,dz\,dx$
    $\quad + (c_x\,dx + c_y\,dy + c_z\,dz)\,dx\,dy$
    $= (a_x + b_y + c_z)\,dx\,dy\,dz.$

Hence, if the coefficients of ω are combined into the vector R = 
(a, b, c), then the coefficient of dω is the scalar div R (see p. 210). 

The derivative of a third-order differential form is of fourth order 
and, hence, vanishes. 


An important general rule ("Poincaré lemma") is that the second 
exterior derivative of any differential form ω vanishes: 

(58d)    $dd\omega = 0.$

In three-space this only has to be proved for the cases where ω 
is either of order 0 or 1. Now if ω is a scalar a(x, y, z), we have by (58a, b) 

$dd\omega = d(a_x\,dx + a_y\,dy + a_z\,dz) = 0.$

This is really only a different way of expressing the rule stated on 
p. 210 that curl (grad a) = 0 for any scalar a. Similarly, we find from 
(58b, c) for the case of a first-order differential form 

$\omega = A\,dx + B\,dy + C\,dz$

that 

$dd\omega = d[(C_y - B_z)\,dy\,dz + (A_z - C_x)\,dz\,dx + (B_x - A_y)\,dx\,dy] = 0.$

This again is nothing else but the rule div (curl R) = 0 valid for any 
vector R (see p. 211). 
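
For forms with polynomial coefficients the rule ddω = 0 can be verified exactly by machine. The following sketch is an illustration under assumptions of its own: a polynomial in x, y, z is stored as a dictionary from exponent triples to integer coefficients, a form as a dictionary from increasing index tuples (0 for dx, 1 for dy, 2 for dz) to such polynomials, and the helper names `pdiff`, `padd`, and `d` are hypothetical. The function `d` follows the prescription of the text: each scalar factor is replaced by its differential, which is written in front of the existing differentials and then moved into increasing order.

```python
def pdiff(p, v):
    """Partial derivative of the polynomial p with respect to variable number v."""
    out = {}
    for exps, c in p.items():
        if exps[v] > 0:
            e = list(exps)
            e[v] -= 1
            key = tuple(e)
            out[key] = out.get(key, 0) + c * exps[v]
    return out


def padd(p, q, sign=1):
    """Return p + sign * q, dropping terms whose coefficient is zero."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}


def d(form):
    """Exterior derivative of a form with polynomial coefficients."""
    out = {}
    for idx, coeff in form.items():
        for v in range(3):
            if v in idx:
                continue  # a repeated differential gives a vanishing term
            # Moving the new differential past each smaller index changes the sign once.
            passed = sum(1 for i in idx if i < v)
            sign = -1 if passed % 2 else 1
            key = tuple(sorted(idx + (v,)))
            out[key] = padd(out.get(key, {}), pdiff(coeff, v), sign)
    return {k: p for k, p in out.items() if p}


# Scalar a = x^2 y z^3 + 3xy, stored as a form of order 0.
a = {(): {(2, 1, 3): 1, (1, 1, 0): 3}}
# First-order form L = xyz dx + x^2 z dy + x y^3 dz.
L = {(0,): {(1, 1, 1): 1}, (1,): {(2, 0, 1): 1}, (2,): {(1, 3, 0): 1}}
# First-order form -y dx + x dy, whose derivative is 2 dx dy.
rot = {(0,): {(0, 1, 0): -1}, (1,): {(1, 0, 0): 1}}
```

Applying `d` twice to `a` or to `L` returns the empty dictionary, that is, the zero form, which is the computational content of curl (grad a) = 0 and div (curl R) = 0 for these examples.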

The inverse problem of finding a form τ that has a given form ω as 
its exterior derivative is basic. We should like to represent a given 
differential form ω as 

(58e)    $\omega = d\tau$

with a suitable differential form τ. We call ω an exact, or total, 
differential when such a representation is possible. Applying rule (58d) to 
the differential τ, we see that a necessary condition for ω to be an exact 
differential is that dω = 0.¹ It turns out that this condition is also 
sufficient; that is, for dω = 0 the equation (58e) has a solution τ, provided 
we restrict ourselves to a rectangular neighborhood of a point 
$(x_0, y_0, z_0)$ interior to the domain of definition² of ω. 

We prove this statement separately for each order of ω. If ω is of 
order 1, say 

$\omega = A\,dx + B\,dy + C\,dz,$

then, by (58b), the condition dω = 0 is equivalent to the relations 

(58f)    $C_y - B_z = 0, \qquad A_z - C_x = 0, \qquad B_x - A_y = 0.$

But these are just the integrability conditions that permit us to 
represent ω as the total differential of some function f, provided we 
restrict the point (x, y, z) to a rectangular parallelepiped containing 
$(x_0, y_0, z_0)$ or, more generally, to a simply connected set (see p. 104). 

For ω of order 2, 

$\omega = a\,dy\,dz + b\,dz\,dx + c\,dx\,dy,$

the condition dω = 0 by (58c) is equivalent to 

(58g)    $a_x + b_y + c_z = 0.$

¹Forms ω for which dω = 0 are called closed. 
²We always assume that the differential forms considered here have coefficients 
with as many continuous derivatives as are needed for our arguments to hold. 

Assume that this condition is satisfied in the rectangular 
parallelepiped 

$|x - x_0| < r_1, \qquad |y - y_0| < r_2, \qquad |z - z_0| < r_3.$

We have to show that ω = dτ, where τ is of the form 

$\tau = A\,dx + B\,dy + C\,dz.$

This means functions A, B, C have to be found for which 

$a = C_y - B_z, \qquad b = A_z - C_x, \qquad c = B_x - A_y.$


We try to satisfy these equations with the choice C = 0. Then A and 
B have to be of the form 

$A(x, y, z) = \alpha(x, y) + \int_{z_0}^{z} b(x, y, \zeta)\,d\zeta,$

$B(x, y, z) = \beta(x, y) - \int_{z_0}^{z} a(x, y, \zeta)\,d\zeta$

in order to satisfy the first two equations. It follows, using condition 
(58g), that 

$\frac{\partial}{\partial z}(B_x - A_y) = \frac{\partial}{\partial x}B_z - \frac{\partial}{\partial y}A_z = -a_x - b_y = c_z.$

Hence $B_x - A_y - c$ does not depend on z. The third equation 
$c = B_x - A_y$ will be satisfied for all z in question if it holds for $z = z_0$. 
Hence, we only have to determine the functions α(x, y) and β(x, y) in 
such a way that 

$\beta_x(x, y) - \alpha_y(x, y) = c(x, y, z_0).$

This is achieved by taking 

$\alpha(x, y) = 0, \qquad \beta(x, y) = \int_{x_0}^{x} c(\xi, y, z_0)\,d\xi,$

for example. 
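
The construction can be tried out numerically on a concrete closed form. The sketch below is an illustration only: the coefficients a = xy, b = yz, c = -yz - z²/2 (which satisfy a_x + b_y + c_z = 0), the base point (0, 0, 1), the test point, and the helper names `integrate`, `potential`, `partial`, and `curl` are assumptions made for this example. The integrals are approximated by the midpoint rule and the derivatives by central differences, so agreement is expected only up to a small tolerance.

```python
def integrate(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g from lo to hi."""
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h


def potential(a, b, c, x0, y0, z0):
    """Coefficients (A, B, C) of tau built as in the text, with C = 0 and alpha = 0."""
    def A(x, y, z):
        return integrate(lambda s: b(x, y, s), z0, z)

    def B(x, y, z):
        beta = integrate(lambda s: c(s, y, z0), x0, x)
        return beta - integrate(lambda s: a(x, y, s), z0, z)

    def C(x, y, z):
        return 0.0

    return A, B, C


def partial(F, v, p, h=1e-4):
    """Central-difference approximation of the partial derivative of F in variable v at p."""
    plus, minus = list(p), list(p)
    plus[v] += h
    minus[v] -= h
    return (F(*plus) - F(*minus)) / (2 * h)


def curl(A, B, C, p):
    """(C_y - B_z, A_z - C_x, B_x - A_y): the coefficients of d(tau) by (58b)."""
    return (partial(C, 1, p) - partial(B, 2, p),
            partial(A, 2, p) - partial(C, 0, p),
            partial(B, 0, p) - partial(A, 1, p))


# Illustrative closed second-order form: a_x + b_y + c_z = y + z - y - z = 0.
a = lambda x, y, z: x * y
b = lambda x, y, z: y * z
c = lambda x, y, z: -y * z - z * z / 2

A, B, C = potential(a, b, c, 0.0, 0.0, 1.0)
point = (0.7, -0.4, 1.3)
computed = curl(A, B, C, point)
expected = (a(*point), b(*point), c(*point))
```

For this example the integrals can also be done by hand, giving A = y(z² - 1)/2 and B = -(y + 1/2)x - xy(z - 1), from which the three relations a = -B_z, b = A_z, c = B_x - A_y are checked directly.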
Finally, for a third-order form 

$\omega = a(x, y, z)\,dx\,dy\,dz$

the condition dω = 0 is always satisfied. We want to represent ω in 
the form ω = dτ, where τ is a second-order differential form 

$\tau = \alpha\,dy\,dz + \beta\,dz\,dx + \gamma\,dx\,dy.$

By (58c) this amounts to finding functions α, β, γ for which 

$\alpha_x + \beta_y + \gamma_z = a.$

One solution clearly is given by 

$\alpha(x, y, z) = \beta(x, y, z) = 0, \qquad \gamma(x, y, z) = \int_{z_0}^{z} a(x, y, \zeta)\,d\zeta.$

This proves our theorem. 


Exercises 3.6c 


1. Evaluate dω for each of the following: 
(a) ω = arc tan y/x 
(b) ω = y dx - x dy 
(c) ω = f(x, y) dx dy 
(d) $\omega = x^2 \cos y \sin z\,dy\,dz - x \sin y \sin z\,dz\,dx + x \cos z\,dx\,dy$ 
(e) $\omega = (z^2 - y^2)x\,dy\,dz + (x^2 - z^2)y\,dz\,dx + (y^2 - x^2)z\,dx\,dy.$ 
2. For first-order forms in three variables, show that 
$d(\omega_1\omega_2) = (d\omega_1)\omega_2 - \omega_1(d\omega_2).$


3. Show that any product of exact first-order forms in three variables is 
exact. 


d. Exterior Differential Forms in Arbitrary Coordinates 


So far, we have always looked at differential forms as linear 
combinations of alternating products of the differentials dx, dy, dz 
of the Cartesian coordinates x, y, z in space. We made essential use of 
this representation of forms in terms of dx, dy, dz in defining the 
product of two forms and the derivative of a form. The usefulness of 
alternating differential forms in applications depends on the fact that 
these forms can be defined and operations on forms can be performed 
in the same way when three-dimensional¹ euclidean space is referred 
to any curvilinear coordinates u, v, w. More generally, this holds on 
any noneuclidean three-dimensional space or manifold¹ referred to 
parameters u, v, w, for example, on a three-dimensional "surface" in 
four-dimensional euclidean space. What is important is that 
operations on forms can be defined in an invariant manner, without 
reference to a special coordinate system, and that the resulting formulae 
look the same in every system. 

¹The dimension 3 is chosen here only for the sake of definiteness. All these 
considerations are equally valid for any other number of dimensions. 

In this context, one thinks of the points P of the three-dimensional 
space or of a manifold Σ as geometric objects that exist independently 
of any coordinate system. A scalar f is a function of P with real 
numbers as values (that is, a mapping of Σ into the real number axis). 
There are, however, many ways of describing points P by curvilinear 
coordinates, that is, by triples of numbers (u, v, w), for example, by 
rectangular coordinates or spherical coordinates in euclidean space. 
We always assume that any two such coordinate systems, say u, v, w 
and u′, v′, w′, are related by transformation equations 

$u' = \phi(u, v, w), \qquad v' = \psi(u, v, w), \qquad w' = \chi(u, v, w),$

where φ, ψ, χ are continuous functions with as many continuous 
derivatives as required for our operations, and with a Jacobian 
$\frac{d(u', v', w')}{d(u, v, w)}$ that does not vanish.² In that case u, v, w can be expressed 
by similar formulae in terms of u′, v′, w′. In a given coordinate system 
u, v, w a scalar f = f(P) becomes a function f(u, v, w) of the coordinates 
u, v, w of the point P. In different coordinate systems, the functions 
representing the same scalar are generally quite different. 

On the manifold Σ let C be a curve with the parametric 
representation P = P(t); with every real number t of a certain interval the 
parametric equation associates a point P of the manifold Σ. Any 
scalar f(P) defined on Σ yields a function of t along C obtained by 
forming the composition f(P(t)). If this function is differentiable, it 
makes sense to form the derivative df/dt, which is defined for the given 
curve and parametric representation of C, independently of any 
curvilinear coordinate system used for Σ. In a given coordinate system the 
coordinates u, v, w of a point P themselves are functions u = u(t), 
v = v(t), w = w(t); and f(P(t)) is given by the compound function 
f(u(t), v(t), w(t)). Assuming f(u, v, w) and u(t), v(t), w(t) to have 
continuous derivatives, we find from the chain rule of differentiation that in 
the particular u, v, w-system df/dt takes the form 

(59)    $\frac{df}{dt} = \frac{\partial f}{\partial u}\frac{du}{dt} + \frac{\partial f}{\partial v}\frac{dv}{dt} + \frac{\partial f}{\partial w}\frac{dw}{dt}.$

¹Generally we use the term "manifold" to denote a parametrically given set of any 
number of dimensions m ≤ n in n-dimensional euclidean space. 

²The particular representation of the transformation involving univalued functions 
φ, ψ, χ needs to be valid only locally, that is, in a sufficiently small neighborhood of 
some point. 


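A zero-order differential form in Σ is just a scalar f. The general 
first-order differential form is defined as a formal expression of the 
type 

$\omega = \sum_{i=1}^{N} a_i\,df_i,$

where $a_1, \ldots, a_N, f_1, \ldots, f_N$ are given scalars. Along any curve C 
referred to a parameter t, we associate with ω the function of t, 
denoted by ω/dt, which is defined by 

$\frac{\omega}{dt} = \sum_{i=1}^{N} a_i\,\frac{df_i}{dt}.$

Two forms 

$\omega = \sum_{i=1}^{N} a_i\,df_i$ and $\omega' = \sum_{i=1}^{N'} b_i\,dg_i$

are considered equal if 

$\frac{\omega}{dt} = \frac{\omega'}{dt}$

for any curve C and any parameter t along C. 
In a particular u, v, w-coordinate system ω/dt becomes 

$\frac{\omega}{dt} = \sum_{i=1}^{N} a_i\left(\frac{\partial f_i}{\partial u}\frac{du}{dt} + \frac{\partial f_i}{\partial v}\frac{dv}{dt} + \frac{\partial f_i}{\partial w}\frac{dw}{dt}\right) = A\frac{du}{dt} + B\frac{dv}{dt} + C\frac{dw}{dt},$

where 

$A = \sum_{i=1}^{N} a_i\frac{\partial f_i}{\partial u}, \qquad B = \sum_{i=1}^{N} a_i\frac{\partial f_i}{\partial v}, \qquad C = \sum_{i=1}^{N} a_i\frac{\partial f_i}{\partial w}$

are scalars defined in Σ. By our definition of equality of first-order 
differential forms, we can write ω as 

$\omega = A\,du + B\,dv + C\,dw.$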


Here the coefficients A, B, C of ω referred to a particular coordinate 
system u, v, w are determined uniquely, for if we take for the curve C 
a "coordinate line," say u = t, v = constant, w = constant, we find 

$\frac{\omega}{dt} = \frac{\omega}{du} = A,$

and similarly, 

$\frac{\omega}{dv} = B, \qquad \frac{\omega}{dw} = C.$

Thus, in any particular coordinate system u, v, w, we can write ω as 

(60)    $\omega = \frac{\omega}{du}\,du + \frac{\omega}{dv}\,dv + \frac{\omega}{dw}\,dw,$

where ω/du really stands for the partial derivative formed along a 
curve where v and w are constant. This formula can be regarded as an 
extension of the chain rule (59) from the differential df of any scalar f 
to a general first-order differential form ω. 

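We can define now in exactly the same manner a second-order 
alternating differential form ω as a formal expression of the type 

(61a)    $\omega = \sum_{i=1}^{N} a_i\,df_i\,dg_i,$

where $a_1, \ldots, a_N, f_1, \ldots, f_N, g_1, \ldots, g_N$ are scalars defined on Σ. 
On any surface S in Σ referred to parameters s, t, we associate with 
the form ω the values ω/ds dt defined by 

(61b)    $\frac{\omega}{ds\,dt} = \sum_{i=1}^{N} a_i\,\frac{d(f_i, g_i)}{d(s, t)} = \sum_{i=1}^{N} a_i \begin{vmatrix} \dfrac{\partial f_i}{\partial s} & \dfrac{\partial f_i}{\partial t} \\ \dfrac{\partial g_i}{\partial s} & \dfrac{\partial g_i}{\partial t} \end{vmatrix}.$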


Two forms ω and ω′, although represented with the help of different 
scalars, are considered identical when they determine the same values 
ω/ds dt = ω′/ds dt on each surface for every parameter 
representation. Now in any particular coordinate system u, v, w we have for two 
scalars f, g 

$\begin{vmatrix} f_s & f_t \\ g_s & g_t \end{vmatrix} = \begin{vmatrix} f_u u_s + f_v v_s + f_w w_s & f_u u_t + f_v v_t + f_w w_t \\ g_u u_s + g_v v_s + g_w w_s & g_u u_t + g_v v_t + g_w w_t \end{vmatrix}$
    $= (f_v g_w - f_w g_v)(v_s w_t - v_t w_s) + (f_w g_u - f_u g_w)(w_s u_t - w_t u_s)$
    $\quad + (f_u g_v - f_v g_u)(u_s v_t - u_t v_s);$

hence, 

(61c)    $\frac{\omega}{ds\,dt} = a\,\frac{d(v, w)}{d(s, t)} + b\,\frac{d(w, u)}{d(s, t)} + c\,\frac{d(u, v)}{d(s, t)},$

where 

(61d)    $a = \sum_{i=1}^{N} a_i\,\frac{d(f_i, g_i)}{d(v, w)}, \qquad b = \sum_{i=1}^{N} a_i\,\frac{d(f_i, g_i)}{d(w, u)}, \qquad c = \sum_{i=1}^{N} a_i\,\frac{d(f_i, g_i)}{d(u, v)}.$

Thus, we can write ω in the u, v, w-system as 

(61e)    $\omega = a\,dv\,dw + b\,dw\,du + c\,du\,dv.$

The coefficients a, b, c in this representation of ω are again determined 
uniquely; they are given by 

$a = \frac{\omega}{dv\,dw}, \qquad b = \frac{\omega}{dw\,du}, \qquad c = \frac{\omega}{du\,dv},$

where a = ω/dv dw is formed with respect to a coordinate surface 
v = s, w = t, u = constant, and similarly for b and c. In the u, v, w- 
system the symbolic expression (61e) for ω becomes 

(61f)    $\omega = \frac{\omega}{dv\,dw}\,dv\,dw + \frac{\omega}{dw\,du}\,dw\,du + \frac{\omega}{du\,dv}\,du\,dv,$

in analogy to the formula (60) for first-order differential forms.¹ 


¹Formulae (61a, b) retain their validity for second-order forms in n-dimensional 
space referred to parameters $u_1, \ldots, u_n$. Instead of (61c, d, e, f), we have then 

(61g)    $\omega = \sum_{\substack{j, k = 1, \ldots, n \\ j < k}} A_{jk}\,du_j\,du_k,$

where 

(61h)    $A_{jk} = \sum_{i} a_i\,\frac{d(f_i, g_i)}{d(u_j, u_k)} = \frac{\omega}{du_j\,du_k},$

as is easily verified. 

We define the product LM of two first-order forms 

(62a)    $L = \sum_{i} a_i\,df_i, \qquad M = \sum_{k} b_k\,dg_k$

on a surface with parameters s, t, as that second-order form ω, for 
which 

(62b)    $\frac{\omega}{ds\,dt} = \frac{L}{ds}\frac{M}{dt} - \frac{L}{dt}\frac{M}{ds}$
    $= \sum_{i} a_i\frac{\partial f_i}{\partial s} \sum_{k} b_k\frac{\partial g_k}{\partial t} - \sum_{i} a_i\frac{\partial f_i}{\partial t} \sum_{k} b_k\frac{\partial g_k}{\partial s}$
    $= \sum_{i, k} a_i b_k\,\frac{d(f_i, g_k)}{d(s, t)}.$

Consequently, if L and M are given by (62a), LM can be identified with 
the second-order form 

(62c)    $\omega = \sum_{i, k} a_i b_k\,df_i\,dg_k.$

However, the definition of ω/ds dt = LM/ds dt given by (62b) does not 
depend on the particular representation of L and M in terms of scalars 
$a_i, f_i, b_k, g_k$; hence, formula (62c) must represent the same form ω = 
LM for all representations of the factors L, M. 

Another way of generating second-order forms from those of first 
order is by differentiation. Given the first-order form 

(63a)    $L = \sum_{i} a_i\,df_i$

we can define dL without reference to any particular coordinate 
system by the prescription¹ 

(63b)    $\frac{dL}{ds\,dt} = \frac{\partial}{\partial s}\frac{L}{dt} - \frac{\partial}{\partial t}\frac{L}{ds}$
    $= \frac{\partial}{\partial s}\sum_{i} a_i\frac{\partial f_i}{\partial t} - \frac{\partial}{\partial t}\sum_{i} a_i\frac{\partial f_i}{\partial s}$
    $= \sum_{i}\left(\frac{\partial a_i}{\partial s}\frac{\partial f_i}{\partial t} - \frac{\partial a_i}{\partial t}\frac{\partial f_i}{\partial s}\right) = \sum_{i} \frac{d(a_i, f_i)}{d(s, t)}.$

¹Here ∂/∂s and ∂/∂t denote "partial" differentiation (or derivatives) with t and s, 
respectively, held constant. (A consistent distinction between ordinary and partial 
differentiation can hardly be made.) 

This is equivalent to the formula 

(63c)    $dL = \sum_{i} da_i\,df_i,$

and shows that the second-order form dL does not depend on the 
particular representation (63a) of L in terms of the scalars $a_i, f_i$. It is 
the natural generalization of formula (58b) for the special case of the 
derivative of a form L expressed as L = A dx + B dy + C dz. 

In the particular case where the first-order form L is a total 
differential, that is, L = df with a scalar f, we find, of course, from (63c) 
that dL = 0. Hence, for a form f of order 0, the rule 

$ddf = 0$

is verified. When L is represented in terms of a particular coordinate 
system u, v, w in space by the standard form 

$L = A\,du + B\,dv + C\,dw,$

we find from (61f), (63b) 

$dL = dA\,du + dB\,dv + dC\,dw$
    $= \frac{dL}{dv\,dw}\,dv\,dw + \frac{dL}{dw\,du}\,dw\,du + \frac{dL}{du\,dv}\,du\,dv$
    $= \left(\frac{\partial}{\partial v}\frac{L}{dw} - \frac{\partial}{\partial w}\frac{L}{dv}\right) dv\,dw + \left(\frac{\partial}{\partial w}\frac{L}{du} - \frac{\partial}{\partial u}\frac{L}{dw}\right) dw\,du$
    $\quad + \left(\frac{\partial}{\partial u}\frac{L}{dv} - \frac{\partial}{\partial v}\frac{L}{du}\right) du\,dv$
    $= (C_v - B_w)\,dv\,dw + (A_w - C_u)\,dw\,du + (B_u - A_v)\,du\,dv,$

in agreement with formula (58b). 

If dL = 0, we obtain as before that $C_v - B_w = A_w - C_u = B_u - A_v = 0$. 
It follows that locally there exists a scalar f for which $A = f_u$, 
$B = f_v$, $C = f_w$ or L = df. 

Finally, a third-order alternating differential form is defined by a 
formal expression 

(64a)    $\omega = \sum_{i} a_i\,df_i\,dg_i\,dh_i$

with scalars $a_i, f_i, g_i, h_i$. In any parameter system r, s, t in space it 
defines the values 

(64b)    $\frac{\omega}{dr\,ds\,dt} = \sum_{i} a_i\,\frac{d(f_i, g_i, h_i)}{d(r, s, t)}.$

With reference to a particular u, v, w-coordinate system, we can write 

(64c)    $\frac{\omega}{dr\,ds\,dt} = \sum_{i} a_i\,\frac{d(f_i, g_i, h_i)}{d(u, v, w)}\,\frac{d(u, v, w)}{d(r, s, t)}.$

This amounts to the identity 

(64d)    $\omega = a\,du\,dv\,dw,$

where 

(64e)    $a = \sum_{i} a_i\,\frac{d(f_i, g_i, h_i)}{d(u, v, w)}.$

We can define the product Lω of a first-order form 

$L = \sum_{i} a_i\,df_i$

and a second-order form 

$\omega = \sum_{k} b_k\,dg_k\,dh_k$

by specifying that 

$\frac{L\omega}{dr\,ds\,dt} = \frac{L}{dr}\frac{\omega}{ds\,dt} + \frac{L}{ds}\frac{\omega}{dt\,dr} + \frac{L}{dt}\frac{\omega}{dr\,ds}$
    $= \sum_{i, k} a_i b_k\left[\frac{\partial f_i}{\partial r}\frac{d(g_k, h_k)}{d(s, t)} + \frac{\partial f_i}{\partial s}\frac{d(g_k, h_k)}{d(t, r)} + \frac{\partial f_i}{\partial t}\frac{d(g_k, h_k)}{d(r, s)}\right]$
    $= \sum_{i, k} a_i b_k\,\frac{d(f_i, g_k, h_k)}{d(r, s, t)}.$

This amounts to the formula 

(65a)    $L\omega = \sum_{i, k} a_i b_k\,df_i\,dg_k\,dh_k,$

as could be expected from the formal multiplication of expressions for 
L and ω. When L and ω are in their standard form 

$L = A\,du + B\,dv + C\,dw, \qquad \omega = a\,dv\,dw + b\,dw\,du + c\,du\,dv$

for a given u, v, w-coordinate system, the product becomes 

(65b)    $L\omega = (Aa + Bb + Cc)\,du\,dv\,dw,$

in accordance with (57c). 

¹In n-dimensional space referred to parameters $u_1, \ldots, u_n$, we have instead of (64c, 
d, e) the formula 

$\omega = \sum_{\substack{j, k, m = 1, \ldots, n \\ j < k < m}} A_{jkm}\,du_j\,du_k\,du_m,$

where 

$A_{jkm} = \sum_{i} a_i\,\frac{d(f_i, g_i, h_i)}{d(u_j, u_k, u_m)} = \frac{\omega}{du_j\,du_k\,du_m}.$
The derivative of the second-order form 


© = dia dgi dhi 


can be defined independently of special coordinate systems by the rule 


do _ð o +2 O 4 ô @ 
drdsdt or dsdt + 95 didr ot drds 
3 d(gi, hi) Agi, hi) 3 d(gi, hi) 
=o" as, ) T as > dit, r) + a dr s) 
Thus, 
do d(ai, gi, hi) 
(66a) drdsdi > dir s, t) ’ 


as one verifies easily. Hence, our definition of dœ implies 


(66b) do = } da: dgi dh. 


For in the standard form 


(66c) o= a dv dw + b dw du + c du dv 
we obtain 
(66d) do = (dy + by + cw)du du dw. 


This special representation for dœ can again be used as on p. 315 to 
show that a second-order form œ with dw = 0 is representable locally 
as œ = dL, where L is a suitable first-order differential form. 
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Exercises 3.6d 


1. In spherical coordinates, $x = \rho \sin\phi \cos\theta$, $y = \rho \sin\phi \sin\theta$,
$z = \rho \cos\phi$, choose unit vectors $u$, $v$, $w$ in the directions of the
$\rho$, $\phi$, $\theta$ lines, respectively. Show that
$dX = (dx, dy, dz) = u\,d\rho + v\,\rho\,d\phi + w\,\rho \sin\phi\,d\theta$.
Hence, find the expression for $\nabla f(\rho, \phi, \theta)$ in spherical
coordinates, where $\nabla f$ is defined by $\nabla f \cdot dX = df$.


3.7 Maxima and Minima 


a. Necessary Conditions 


For functions of several variables, as for functions of a single vari- 
able, one of the most important applications of differentiation is the 
theory of maxima and minima. 

We shall begin by considering a function $u = f(x, y)$ of two independent
variables $x$, $y$. The domain of the function shall be a certain set $R$ in
the x, y-plane. We can represent $f$ in x, y, z-space by the surface $S$ with
equation $z = f(x, y)$. We say that $f(x, y)$ has a maximum¹ at the point
$(x_0, y_0)$ of its domain $R$ if $f(x_0, y_0) \ge f(x, y)$ for all $(x, y)$ in
$R$. Such a maximum corresponds to a highest point of the surface $S$. We talk
of a strict maximum if actually $f(x_0, y_0) > f(x, y)$ for all $(x, y)$ in $R$
that are different from $(x_0, y_0)$, so that the greatest value of the
function is reached only at the single point $(x_0, y_0)$. Similarly,
$f(x, y)$ is said to have a minimum at the point $(x_1, y_1)$ of $R$ if
$f(x_1, y_1) \le f(x, y)$ for all $(x, y)$ in $R$, and a strict minimum if
$f(x_1, y_1) < f(x, y)$ for all $(x, y) \ne (x_1, y_1)$ in $R$. The basic
theorem of p. 112 assures us that if $R$ is a closed and bounded set and $f$
is continuous in $R$, then there exist points in $R$ where $f$ has its maximum
and also points where $f$ has its minimum.

As an example consider the function $u = x^2 + y^2$ in the closed disc
given by $x^2 + y^2 \le 1$. The surface $S$ is the portion of the paraboloid
of revolution $z = x^2 + y^2$ lying below the plane $z = 1$. Here the
maxima of $f$ occur at all the points of the boundary circle $x^2 + y^2 = 1$,
whereas $f$ has a strict minimum at the origin.

¹Also called absolute maximum in contrast to the relative maximum defined below.
The terminology used here is exactly the same as for functions of a single variable;
see Volume I (pp. 238 ff.).

Calculus applies directly to the determination of relative maxima
or minima, rather than of absolute extrema. A point $(x_0, y_0)$ of the
domain $R$ is a relative maximum if $f(x_0, y_0) \ge f(x, y)$ for all points
$(x, y)$ of $R$ that lie in a sufficiently small neighborhood of $(x_0, y_0)$.
The value $f(x_0, y_0)$ at a relative maximum does not have to be the greatest
value of $f$ in all of $R$ but is a maximum of $f$ if we restrict ourselves to
points sufficiently close to $(x_0, y_0)$. Relative minima are defined
analogously. Every absolute maximum (minimum) also is a relative
maximum (minimum), but the converse does not hold.

For example, the function $u = (x^2 + y^2)^3 - 3(x^2 + y^2)$, whose domain
shall be the open disc $x^2 + y^2 < 4$, has no maximum but does have a
relative maximum at the origin. All points on the circle $x^2 + y^2 = 1$ are
minimum points. Here the surface $S$ is generated by rotating the curve
$z = x^6 - 3x^2$ about the z-axis.
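
The claims about this rotationally symmetric example can be checked numerically. The following is only a rough sketch: the function names `u` and `profile` and the sampling step of 0.001 are choices made here for illustration and are not part of the text.

```python
def u(x: float, y: float) -> float:
    """u = (x^2 + y^2)^3 - 3(x^2 + y^2) on the open disc x^2 + y^2 < 4."""
    s = x * x + y * y
    return s ** 3 - 3.0 * s


def profile(r: float) -> float:
    """Radial profile z = r^6 - 3 r^2; rotating it about the z-axis gives S."""
    return u(r, 0.0)


# Sample radii 0, 0.001, ..., 1.999 (the boundary r = 2 is excluded).
radii = [k / 1000.0 for k in range(2000)]
values = [profile(r) for r in radii]

r_min = radii[values.index(min(values))]          # radius of the sampled minimum
near_origin = [profile(r) for r in radii[1:50]]   # small radii r > 0

print("sampled minimum at r =", r_min, "with value", min(values))
print("u(0, 0) =", u(0.0, 0.0), "and nearby values are negative:", max(near_origin) < 0.0)
print("largest sampled value:", max(values), "(the bound 52 is approached, never attained)")
```

The sampled minimum sits on the circle of radius 1, the origin beats its neighbors, and the values keep growing toward the excluded boundary, which is why no absolute maximum exists on the open disc.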

The definitions of absolute or relative minima for functions
$u = f(x, y, z, \ldots)$ of more independent variables are entirely similar.

We shall first give necessary conditions for the occurrence of a relative
maximum or minimum at an interior point $(x_0, y_0)$ of the domain $R$ of the
function $f(x, y)$. We use the term relative extremum to include both maxima
and minima. Let $(x_0, y_0)$ be such an interior point, and let $f$ have
partial derivatives $f_x(x_0, y_0)$, $f_y(x_0, y_0)$ at that point. For a
relative extremum of $f$ to occur at the point $(x_0, y_0)$, it is necessary
that


(67a)    $f_x(x_0, y_0) = 0, \qquad f_y(x_0, y_0) = 0.$


The conditions (67a) follow at once from the known conditions
for functions of a single variable. Put $\phi(x) = f(x, y_0)$. Then $\phi(x)$
is defined for all $x$ sufficiently close to $x_0$ and has at $x_0$ the
derivative $\phi'(x_0) = f_x(x_0, y_0)$. If $f(x_0, y_0) \ge f(x, y)$ for all
$(x, y)$ in $R$ that are sufficiently close to $(x_0, y_0)$, then, in
particular, $\phi(x_0) \ge \phi(x)$ for all $x$ sufficiently close to $x_0$.
It follows (see Volume I, p. 241) that $\phi'(x_0) = 0$; that is,
$f_x(x_0, y_0) = 0$. The second necessary condition $f_y(x_0, y_0) = 0$
is derived similarly.

Geometrically, the vanishing of the partial derivatives of f(x, y) 
at the point (xo, yo) means that at the point (xo, yo, f(xo, yo)) the tangent 
plane to the surface z = f(x, y) is parallel to the x, y-plane. We call 
(xo, yo) a stationary or critical point of f(x, y) if the first derivatives 
fx(xo, yo), fy(xo, yo) both exist and vanish. Hence, every relative ex- 
tremum in the interior of the domain of a differentiable function f is 
a critical point of f. 


The same result applies to functions $f(x, y, z, \ldots)$ of any number
of independent variables. Here $(x_0, y_0, z_0, \ldots)$ is a stationary or
critical point of $f$ if all first derivatives $f_x, f_y, \ldots$ at that
point exist and satisfy


(67b)    $f_x(x_0, y_0, z_0, \ldots) = 0, \quad f_y(x_0, y_0, z_0, \ldots) = 0, \quad f_z(x_0, y_0, z_0, \ldots) = 0, \quad \ldots.$


The number of conditions is equal to that of independent variables
$x, y, z, \ldots$. We can combine the conditions into the single requirement
that


    $df = f_x\,dx + f_y\,dy + f_z\,dz + \cdots = 0$


for $(x, y, z, \ldots) = (x_0, y_0, z_0, \ldots)$ and all $dx, dy, dz, \ldots$.

Since the number of equations (67b) is the same as the number of
unknowns $x_0, y_0, z_0, \ldots$, one usually expects to find a finite number
of critical points, though, of course, that is not always so. Moreover, a
critical point need not by any means be a relative extremum.

Consider, for example, the function u = xy. Our two equations (67a) 
at once give the point x = 0, y = 0 as the only critical point. In every 
neighborhood of (0, 0), however, the function may assume either 
positive or negative values, depending on the quadrant containing 
(x, y). The function therefore has no relative extremum at this point. 
The surface representing the function u = xy geometrically is a hyper- 
bolic paraboloid that has neither a highest nor lowest point, but has a 
saddle point at the origin (see Fig. 3.1). 
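
The behavior of $u = xy$ near its critical point can be illustrated with a few evaluations. This is a minimal sketch; the helper names `f` and `gradient` and the step `h` are assumptions introduced here.

```python
def f(x: float, y: float) -> float:
    """The function u = xy from the example."""
    return x * y


def gradient(x: float, y: float) -> tuple:
    """Exact partial derivatives (f_x, f_y) = (y, x)."""
    return (y, x)


h = 1e-3                               # distance from the origin used for probing
grad_at_origin = gradient(0.0, 0.0)    # both conditions (67a) hold here

first_quadrant = f(h, h)       # positive value arbitrarily close to (0, 0)
second_quadrant = f(-h, h)     # negative value arbitrarily close to (0, 0)

print("gradient at origin:", grad_at_origin)
print("f in first quadrant:", first_quadrant, " f in second quadrant:", second_quadrant)
```

Because values of both signs occur however small `h` is chosen, the value 0 at the origin is neither a relative maximum nor a relative minimum.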

We see that the maximum and minimum points of a differentiable 
function either lie on the boundary of the domain of the function or 
are to be looked for among the critical points of the function. To 
decide whether a critical point actually is a maximum or minimum 
requires a special investigation. On p. 349 we shall meet conditions 
that are sufficient to ensure that a critical point be at least a relative 
extremum. 

The maximum value $M$ of a function $f(x, y)$ is the greatest of all
values assumed by $f$ at the points of its domain $R$. The maximum
points of $f$ are those for which $f(x, y) = M$.¹ Similarly, the critical
or stationary values of $f$ are those assumed at critical or stationary
points.

¹Sometimes the term “maximum” is used somewhat ambiguously, referring either to
the maximum value or to an argument point $(x, y)$ where $f$ assumes its maximum value.


b. Examples

1. The function

    $u = \sqrt{1 - x^2 - y^2} \qquad (x^2 + y^2 < 1)$

has the partial derivatives

    $u_x = -\dfrac{x}{\sqrt{1 - x^2 - y^2}}, \qquad u_y = -\dfrac{y}{\sqrt{1 - x^2 - y^2}},$

and these vanish at the origin. Here we have a maximum, for at all
other points $(x, y)$ in the neighborhood of the origin the quantity
$1 - x^2 - y^2$ under the square root is less than it is at the origin.

2. We wish to construct the triangle for which the product of the 
sines of the three angles is greatest; that is, we wish to find the 
maximum of the function 


f(x,y) = sin x sin y sin (x + y) 


in the region $0 \le x \le \pi$, $0 \le y \le \pi$, $0 \le x + y \le \pi$. Since
$f$ is positive in the interior of this region, its greatest value is
positive. On the boundary of the region, where the equality sign holds in at
least one of the inequalities defining the region, we have $f(x, y) = 0$, so
that the greatest value must lie in the interior.

If we equate the derivatives to 0, we obtain the two equations 


cos x sin y sin (x + y) + sin x sin y cos (x + y) = 0, 
sin x cos y sin (x + y) + sin x sin y cos (x + y) = 0. 


Since $0 < x < \pi$, $0 < y < \pi$, $0 < x + y < \pi$, these give
$\tan x = \tan y$, or $x = y$. If we substitute this value in the first
equation, we obtain the relation $\sin 3x = 0$; hence, $x = \pi/3$,
$y = \pi/3$ is the only stationary point, and the required triangle is
equilateral.
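
The stationary point found above can be checked numerically. The sketch below is an illustration only: the names `f` and `grad` and the grid size `n` are assumptions made here, and the grid scan merely samples the region rather than proving anything.

```python
import math


def f(x: float, y: float) -> float:
    """Product of the sines of the three angles x, y and pi - x - y."""
    return math.sin(x) * math.sin(y) * math.sin(x + y)


def grad(x: float, y: float) -> tuple:
    """Left-hand sides of the two equations obtained by differentiating f."""
    common = math.sin(x) * math.sin(y) * math.cos(x + y)
    fx = math.cos(x) * math.sin(y) * math.sin(x + y) + common
    fy = math.sin(x) * math.cos(y) * math.sin(x + y) + common
    return fx, fy


x_star = y_star = math.pi / 3.0
gx, gy = grad(x_star, y_star)
f_star = f(x_star, y_star)              # value for the equilateral triangle

# Sample the closed region 0 <= x, 0 <= y, x + y <= pi on a triangular grid.
n = 120
grid_max = max(
    f(i * math.pi / n, j * math.pi / n)
    for i in range(n + 1)
    for j in range(n + 1 - i)
)

print("gradient at (pi/3, pi/3):", (gx, gy))
print("f at (pi/3, pi/3):", f_star, " largest sampled value:", grid_max)
```

The value at the stationary point is $(\sqrt{3}/2)^3 = 3\sqrt{3}/8$, and no sampled point of the region exceeds it.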

3. Three points $P_1$, $P_2$, $P_3$, with coordinates $(x_1, y_1)$,
$(x_2, y_2)$, and $(x_3, y_3)$, respectively, are the vertices of an
acute-angled triangle. We wish to find a fourth point $P$ with coordinates
$(x, y)$ such that the sum of its distances from $P_1$, $P_2$, and $P_3$ is
the least possible. This sum of distances is a continuous function of $x$ and
$y$, and at some point $P$ inside a large circle enclosing the triangle it has
a least value. This point $P$ cannot lie at a vertex of the triangle, for then
the foot of the perpendicular from either of the other two vertices to its
opposite side would give a smaller sum of distances. Again, $P$ cannot lie on
the circumference of the circle, if this is sufficiently far away from the
triangle. With the distances $r_i$ defined by

    $r_i = \sqrt{(x - x_i)^2 + (y - y_i)^2}$

we wish to minimize the function

    $f(x, y) = r_1 + r_2 + r_3,$

which is differentiable everywhere except at $P_1$, $P_2$, and $P_3$. We know
that at the point $P$ the partial derivatives with respect to $x$ and $y$ must
vanish. Thus, by differentiating $f$, we obtain the conditions

    $\dfrac{x - x_1}{r_1} + \dfrac{x - x_2}{r_2} + \dfrac{x - x_3}{r_3} = 0,$

    $\dfrac{y - y_1}{r_1} + \dfrac{y - y_2}{r_2} + \dfrac{y - y_3}{r_3} = 0$

for $P$. According to these equations, the three plane vectors

    $\left(\dfrac{x - x_1}{r_1}, \dfrac{y - y_1}{r_1}\right), \quad \left(\dfrac{x - x_2}{r_2}, \dfrac{y - y_2}{r_2}\right), \quad \left(\dfrac{x - x_3}{r_3}, \dfrac{y - y_3}{r_3}\right)$

have the vector sum 0. Also, these vectors are each of unit length.
When given the common initial point $P$, their end points form an
equilateral triangle; that is, each vector is brought into the direction of
the next by a rotation through $2\pi/3$ (Fig. 3.27). Since these three vectors
have the directions of the three vectors from $P_1$, $P_2$, $P_3$ to $P$, it
follows that each of the three sides of the triangle must subtend the
same angle $2\pi/3$ at the point $P$.


Figure 3.27
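
The two conditions of Example 3 can be observed on a concrete triangle. The sketch below locates the minimizing point with Weiszfeld's fixed-point iteration, which is a technique swapped in here for illustration and is not the method of the text; the triangle, the function names, and the iteration count are likewise assumptions.

```python
import math


def fermat_point(points, iterations=500):
    """Approximate the minimizer of r1 + r2 + r3 by Weiszfeld's iteration."""
    x = sum(px for px, _ in points) / len(points)   # start at the centroid
    y = sum(py for _, py in points) / len(points)
    for _ in range(iterations):
        weight_sum = x_sum = y_sum = 0.0
        for px, py in points:
            r = math.hypot(x - px, y - py)
            if r == 0.0:            # iterate landed on a vertex; stop there
                return x, y
            weight_sum += 1.0 / r
            x_sum += px / r
            y_sum += py / r
        x, y = x_sum / weight_sum, y_sum / weight_sum
    return x, y


def unit_vectors(p, points):
    """The vectors ((x - x_i)/r_i, (y - y_i)/r_i) from the two conditions."""
    result = []
    for px, py in points:
        r = math.hypot(p[0] - px, p[1] - py)
        result.append(((p[0] - px) / r, (p[1] - py) / r))
    return result


vertices = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]    # an acute-angled triangle
p = fermat_point(vertices)
vectors = unit_vectors(p, vertices)

vector_sum = (sum(v[0] for v in vectors), sum(v[1] for v in vectors))
angles = []
for k in range(3):
    a, b = vectors[k], vectors[(k + 1) % 3]
    dot = max(-1.0, min(1.0, a[0] * b[0] + a[1] * b[1]))
    angles.append(math.acos(dot))     # angle between consecutive unit vectors

print("P =", p)
print("sum of unit vectors:", vector_sum)
print("angles between them:", angles, "expected:", 2.0 * math.pi / 3.0)
```

At the computed point the three unit vectors cancel and are separated by angles of $2\pi/3$, in line with the geometric conclusion of the example.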


Exercises 3.7b 


1. Find the stationary points of the following functions and state their 
nature: 
(a) $f(x, y) = y^2(\sin x - x/2)$
(b) f(x, y) = cos (x + y) + sin (x — y) 
(c) f(x,y) =y"

(d) f(x, y) = x/y 

(e) f(x, y) = ye”. 
2. Determine the maxima and minima of the function 

    $(ax^2 + by^2)\,e^{-x^2 - y^2} \qquad (0 < a < b).$
3. Find the values of x, y which make 
2x? + (x — y}? — 6y 

stationary. 
4. The sum of the lengths of the 12 edges of a rectangular block is $a$; the
sum of the areas of the 6 faces is $a^2/25$. Calculate the lengths of the edges
when the excess of the volume of the block over that of a cube whose
edge is equal to the least edge of the block is greatest.


5. Find the stationary points and state their nature, for the function 
1 2 
f(x, y, z) = x*(y — (2 + 3} . 


6. According to present postal regulations in the United States, a rectangular
parcel with side lengths $x$, $y$, $z$ inches with $x \le y \le z$ may be
shipped only if $2(x + y) + z \le 100$. Find the maximum volume of a shippable
parcel under this condition. [Hint: set $z = 100 - 2(x + y)$.]


7. Minimize the sum of the squared distances of a point X from n given 
points. 


c. Maxima and Minima with Subsidiary Conditions 


The problem of determining the maxima and minima of functions of 
several variables frequently presents itself in a different form. For 
example, we may wish to find the point of a given surface g(x, y, z) = 0 
closest to the origin. We then have to minimize the function 


    $f(x, y, z) = \sqrt{x^2 + y^2 + z^2},$


where the quantities x, y, z however, are no longer three independent 
variables but are connected by the equation of the surface g(x, y, z) = 0 
as a subsidiary condition. Such maxima and minima with subsidiary 
conditions do not, indeed, represent a fundamentally new problem. 
Thus in our example we only need solve for one of the variables, say 
z, as a function of the other two, to reduce the problem to that of 
determining the stationary values of a function of the two independent 
variables x, y. 

It is, however, more convenient, and also more elegant, to express 
the conditions for a stationary value in a symmetrical form, in which 
no preference is given to any one of the variables. 
A simple typical case is presented by the problem of finding the
stationary values of a function $f(x, y)$ when the two variables $x$, $y$ are
not mutually independent but are connected by a subsidiary condition


    $\phi(x, y) = 0.$


In order to gain geometric insight, we assume first that the subsidi- 
ary condition is represented, as in Fig. 3.28, by a curve in the x, y- 
plane without singularities and that, in addition, the family of curves 
f(x, y) = c = constant covers a portion of the plane, as in the figure. 


Figure 3.28 Extreme value of $f$ with subsidiary condition $\phi = 0$.


Among the curves of the family that intersect the curve $\phi = 0$, we
have to find that one for which the constant $c$ is greatest or least. As
we describe the curve $\phi = 0$, we cross the curves $f(x, y) = c$, and in
general $c$ changes monotonically; at the point where the sense in
which we run through the c-scale is reversed, we may expect an
extreme value. From Fig. 3.28 we see that this occurs for the curve of
the family that touches the curve $\phi = 0$. The coordinates of the point
of contact will be the required values $x = \xi$, $y = \eta$ corresponding to
the extreme value of $f(x, y)$. If the two curves $f$ = constant and
$\phi = 0$ touch, they have the same tangent. Thus, at the point $x = \xi$,
$y = \eta$, the proportional relation


    $f_x : f_y = \phi_x : \phi_y$


holds; or, if we introduce the constant of proportionality $\lambda$, the two
equations


    $f_x + \lambda\phi_x = 0,$
    $f_y + \lambda\phi_y = 0$


are satisfied. These, with the equation


    $\phi(x, y) = 0,$


serve to determine the coordinates $(\xi, \eta)$ of the point of contact and
also the constant of proportionality $\lambda$.

This argument may fail, for example, when the curve $\phi = 0$ has a
singular point, say a cusp as in Fig. 3.29, at the point $(\xi, \eta)$ at
which it meets a curve $f = c$ with the greatest or least possible $c$. In
this case, however, we have both


    $\phi_x(\xi, \eta) = 0$  and  $\phi_y(\xi, \eta) = 0.$


Figure 3.29 Extreme value at a singular point of $\phi = 0$


We are led intuitively to the following rule, which we shall prove 
in the next subsection: 


In order that an extreme value of the function $f(x, y)$ with the subsidiary
condition $\phi(x, y) = 0$ may occur at the point $x = \xi$, $y = \eta$, where
$\phi_x(\xi, \eta)$ and $\phi_y(\xi, \eta)$ do not both vanish, there must be a
constant of proportionality $\lambda$ such that the two equations


(67c)    $f_x(\xi, \eta) + \lambda\phi_x(\xi, \eta) = 0$  and  $f_y(\xi, \eta) + \lambda\phi_y(\xi, \eta) = 0$


are satisfied together with the equation


(67d)    $\phi(\xi, \eta) = 0.$


This rule is known as Lagrange’s method of undetermined multipliers,
and the factor $\lambda$ is known as Lagrange’s multiplier.

We observe that this rule gives as many equations for the determination
of the quantities $\xi$, $\eta$, and $\lambda$ as there are unknowns. We
have, therefore, replaced the problem of finding the positions of the
extreme values $(\xi, \eta)$ by a problem in which there is an additional
unknown $\lambda$ but in which we have the advantage of complete symmetry.
Lagrange’s rule is usually expressed as follows:


To find the extreme values of the function $f(x, y)$ subject to the
subsidiary condition $\phi(x, y) = 0$, we add to $f(x, y)$ the product of
$\phi(x, y)$ and an unknown factor $\lambda$ independent of $x$ and $y$ and
write down the known necessary conditions,


    $f_x + \lambda\phi_x = 0, \qquad f_y + \lambda\phi_y = 0,$


for an extreme value of $F = f + \lambda\phi$. In conjunction with the
subsidiary condition $\phi = 0$ these serve to determine the coordinates of
the extremum and the constant of proportionality.

As an example, we find the extreme values of the function


    $u = xy$


on the circle with unit radius and center at the origin, that is, with
the subsidiary condition


    $x^2 + y^2 - 1 = 0.$


According to our rule, by differentiating $xy + \lambda(x^2 + y^2 - 1)$ with
respect to $x$ and to $y$, we find that at the stationary points the two
equations


    $y + 2\lambda x = 0,$
    $x + 2\lambda y = 0$

have to be satisfied. In addition we have the subsidiary condition

    $x^2 + y^2 - 1 = 0.$


On solving, we obtain the four points


    $\xi = \tfrac{1}{\sqrt 2},\ \eta = \tfrac{1}{\sqrt 2}; \qquad \xi = -\tfrac{1}{\sqrt 2},\ \eta = -\tfrac{1}{\sqrt 2}; \qquad \xi = \tfrac{1}{\sqrt 2},\ \eta = -\tfrac{1}{\sqrt 2}; \qquad \xi = -\tfrac{1}{\sqrt 2},\ \eta = \tfrac{1}{\sqrt 2}.$

The first two of these give a maximum value $u = \tfrac12$, and the second
two, a minimum value $u = -\tfrac12$, of the function $u = xy$. That the first
two do really give the greatest value and the second two the least
value of the function $u$ follows from the fact that on the circumference
the function must assume a greatest and a least value (cf. p. 325),
since the circumference is closed and bounded.
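
The multiplier equations of this example are easy to verify by substitution. The following sketch is illustrative only: the helper `residuals`, the list `stationary`, and the resolution of the scan are names and choices introduced here, and the multiplier values $\lambda = \mp\tfrac12$ are obtained by solving the two linear equations at each point.

```python
import math


def residuals(x: float, y: float, lam: float) -> tuple:
    """Left-hand sides of y + 2*lam*x = 0, x + 2*lam*y = 0, x^2 + y^2 - 1 = 0."""
    return (y + 2.0 * lam * x, x + 2.0 * lam * y, x * x + y * y - 1.0)


s = 1.0 / math.sqrt(2.0)
# (x, y, multiplier) for the four stationary points of u = xy on the circle.
stationary = [(s, s, -0.5), (-s, -s, -0.5), (s, -s, 0.5), (-s, s, 0.5)]

worst_residual = max(
    abs(r) for x, y, lam in stationary for r in residuals(x, y, lam)
)
values = [x * y for x, y, _ in stationary]

# Independent check: scan u = xy along the circle x = cos t, y = sin t.
scan = [
    math.cos(t) * math.sin(t)
    for t in (2.0 * math.pi * k / 3600 for k in range(3600))
]

print("largest residual:", worst_residual)
print("values at the stationary points:", values)
print("scan maximum and minimum:", max(scan), min(scan))
```

The scan along the circle never leaves the interval between the two stationary values, which agrees with the existence argument used in the text.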


Exercises 3.7c 


1. Solve Exercise 6 of Section 3.7b as a problem in maximizing the volume 
subject to the condition 2(x + y) + z= 100. 

2. Minimize the function z = x?y? subject to the condition x + y = 1. 

3. Maximize the function z= cos v(x + y) subject to the condition 
$x^2 + y^2 = 1$.

4. In the plane, minimize the sum of the squared distances of a point X 
from n given points subject to the condition that X lie on a given line 
(compare Section 3.7b, Exercise 7). 


5. If C= f(a, b) is a true maximum or minimum of f(x, y) subject to the 
condition ¢(x, y) = C’, show that in general C’ = ¢(a, b) is a true maxi- 
mum or minimum of ¢(x, y) subject to the condition f(x, y) = C. 


d. Proof of the Method of Undetermined Multipliers in the 
Simplest Case 


As we should expect, we arrive at an analytical proof of the method
of undetermined multipliers by reducing it to the known case of “free”
extreme values. We assume that at an extremum point the two partial
derivatives $\phi_x(\xi, \eta)$ and $\phi_y(\xi, \eta)$ do not both vanish; to
be specific, we assume that $\phi_y(\xi, \eta) \ne 0$. Then, by the implicit
function theorem (p. 221), in a neighborhood of this point the equation
$\phi(x, y) = 0$ determines $y$ uniquely as a continuously differentiable
function of $x$, say $y = g(x)$. If we substitute this expression in
$f(x, y)$, the function


    $f(x, g(x))$


must have a free extreme value at the point $x = \xi$. For this the
equation


    $\dfrac{d}{dx} f(x, g(x)) = f_x + f_y\,g'(x) = 0$

must hold at $x = \xi$. In addition, the implicitly defined function
$y = g(x)$ satisfies the relation $\phi_x + \phi_y\,g'(x) = 0$ identically. If
we multiply this equation by $\lambda = -f_y/\phi_y$ and add it to
$f_x + f_y\,g'(x) = 0$, we obtain


    $f_x + \lambda\phi_x = 0,$

and by the definition of $\lambda$, the equation

    $f_y + \lambda\phi_y = 0$


holds. This establishes the method of undetermined multipliers.

This proof brings out the importance of the assumption that the
derivatives $\phi_x$ and $\phi_y$ do not both vanish at the point
$(\xi, \eta)$. If both derivatives vanish the rule breaks down, as the
following example shows. We wish to make the function


    $f(x, y) = x^2 + y^2$

a minimum, subject to the condition

    $\phi(x, y) = (x - 1)^3 - y^2 = 0.$


In Fig. 3.30 the shortest distance from the origin to the curve
$(x - 1)^3 - y^2 = 0$ is obviously given by the line joining the origin to the
cusp $S$ of the curve (we can easily prove that the unit circle centered at
the origin contains no other point of the curve). The coordinates of $S$,
that is, $x = 1$ and $y = 0$, satisfy the equations $\phi(x, y) = 0$ and
$f_y + \lambda\phi_y = 0$ no matter what value is assigned to $\lambda$, but


    $f_x + \lambda\phi_x = 2x + 3\lambda(x - 1)^2 = 2 \ne 0.$


Figure 3.30 The curve $(x - 1)^3 - y^2 = 0$.
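
The breakdown at the cusp can be made concrete. In the sketch below, the parametrization $x = 1 + t^2$, $y = t^3$ of the curve and all function names are assumptions introduced for illustration; the code only samples the curve and substitutes into the multiplier equations.

```python
def f(x: float, y: float) -> float:
    """Squared distance from the origin."""
    return x * x + y * y


def phi(x: float, y: float) -> float:
    """Constraint function (x - 1)^3 - y^2."""
    return (x - 1.0) ** 3 - y * y


def curve(t: float) -> tuple:
    """A parametrization of phi = 0: x = 1 + t^2, y = t^3 (cusp at t = 0)."""
    return (1.0 + t * t, t ** 3)


def multiplier_residuals(x: float, y: float, lam: float) -> tuple:
    """Left-hand sides of f_x + lam*phi_x = 0 and f_y + lam*phi_y = 0."""
    return (2.0 * x + 3.0 * lam * (x - 1.0) ** 2, 2.0 * y - 2.0 * lam * y)


ts = [k / 100.0 for k in range(-200, 201)]
on_curve = max(abs(phi(*curve(t))) for t in ts)    # should be about 0
smallest = min(f(*curve(t)) for t in ts)           # attained at the cusp (1, 0)

# At the cusp the first residual equals 2 whatever multiplier is tried.
residuals_at_cusp = [multiplier_residuals(1.0, 0.0, lam) for lam in (-10.0, 0.0, 3.5)]

print("max |phi| on sampled curve:", on_curve)
print("smallest f on the curve:", smallest)
print("multiplier residuals at the cusp:", residuals_at_cusp)
```

The minimum of $f$ on the curve is attained at the cusp, yet no choice of $\lambda$ makes the first multiplier equation hold there, exactly because both $\phi_x$ and $\phi_y$ vanish at that point.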


We can state the method of undetermined multipliers in a slightly 
different way that is particularly convenient for generalization. We 
have seen that the vanishing of the differential of a function F(x, y) 
at a given point is a necessary condition for the occurrence of a free 
extreme value of the function at that point. For the present problem 
we can similarly make the following statement: 


In order for the function $f(x, y)$ to have an extreme value at the point
$(\xi, \eta)$ subject to the subsidiary condition $\phi(x, y) = 0$, the
differential $df$ must vanish at that point, where we consider the
differentials $dx$ and $dy$ to be not independent but subject to the equation


(67e)    $d\phi = \phi_x\,dx + \phi_y\,dy = 0$


deduced from $\phi = 0$. Assume that at the point $(\xi, \eta)$ the
differentials $dx$ and $dy$ satisfy the equation


(67f)    $df = f_x(\xi, \eta)\,dx + f_y(\xi, \eta)\,dy = 0$


whenever they satisfy the equation $d\phi = 0$. Multiplying equation (67e)
by a number $\lambda$ and adding to (67f), we obtain


    $(f_x + \lambda\phi_x)\,dx + (f_y + \lambda\phi_y)\,dy = 0.$

If we determine $\lambda$ so that

(67g)    $f_y + \lambda\phi_y = 0,$


as is possible in virtue of the assumption that $\phi_y \ne 0$, it follows
that $(f_x + \lambda\phi_x)\,dx = 0$, and since the differential $dx$ in (67e)
can be chosen arbitrarily, say, equal to 1, we have

(67h)    $f_x + \lambda\phi_x = 0.$

Conversely, relations (67g, h) with any $\lambda$ imply, of course, that
$df = 0$ whenever $d\phi = 0$.

Exercises 3.7d 


1. Describe the appearance of the surface $z = f(x, y) + \lambda\phi(x, y)$,
for $\lambda$ the Lagrange multiplier and $\phi = 0$ the constraining equation.


e. Generalization of the Method of Undetermined Multipliers


We can extend the method of undetermined multipliers to a greater 
number of variables and also to a greater number of subsidiary con- 
ditions. We shall consider a special case that includes every essential 
feature. We seek the extreme values of the function 


(68a)    $u = f(x, y, z, t),$

when the four variables $x$, $y$, $z$, $t$ satisfy the two subsidiary conditions

(68b)    $\phi(x, y, z, t) = 0, \qquad \psi(x, y, z, t) = 0.$


We assume that at the point $(\xi, \eta, \zeta, \tau)$ the function $f$ takes
a value that is an extreme value when compared with the values at all
neighboring points satisfying the subsidiary conditions. We require
that, in the neighborhood of the point $P = (\xi, \eta, \zeta, \tau)$, two of
the variables, say $z$ and $t$, can be represented as functions of the other
two, $x$ and $y$, by means of the equations (68b). To ensure that such
solutions $z = g(x, y)$ and $t = h(x, y)$ can be found, we assume that at
the point $P$ the Jacobian


(68c)    $\dfrac{d(\phi, \psi)}{d(z, t)} = \phi_z\psi_t - \phi_t\psi_z$


is not zero (cf. p. 265). We now substitute the functions

    $z = g(x, y)$  and  $t = h(x, y)$


in the function $u = f(x, y, z, t)$, to obtain a function of the two
independent variables $x$ and $y$, and this function must have a free extreme
value at the point $x = \xi$, $y = \eta$; that is, its two partial derivatives
must vanish at that point. The two equations


(69a)    $f_x + f_z\,\dfrac{\partial z}{\partial x} + f_t\,\dfrac{\partial t}{\partial x} = 0,$

(69b)    $f_y + f_z\,\dfrac{\partial z}{\partial y} + f_t\,\dfrac{\partial t}{\partial y} = 0$

must therefore hold. In order to calculate from the subsidiary conditions
the four derivatives $\partial z/\partial x$, $\partial z/\partial y$,
$\partial t/\partial x$, $\partial t/\partial y$ occurring here, we could
write down the two pairs of equations

(69c)    $\phi_x + \phi_z\,\dfrac{\partial z}{\partial x} + \phi_t\,\dfrac{\partial t}{\partial x} = 0,$

(69d)    $\psi_x + \psi_z\,\dfrac{\partial z}{\partial x} + \psi_t\,\dfrac{\partial t}{\partial x} = 0$

and

(69e)    $\phi_y + \phi_z\,\dfrac{\partial z}{\partial y} + \phi_t\,\dfrac{\partial t}{\partial y} = 0,$

(69f)    $\psi_y + \psi_z\,\dfrac{\partial z}{\partial y} + \psi_t\,\dfrac{\partial t}{\partial y} = 0$

and solve them for the unknowns $\partial z/\partial x, \ldots, \partial t/\partial y$;
this is possible because the Jacobian $d(\phi, \psi)/d(z, t)$ does not vanish.
Thus, the problem would be solved.

Instead, we prefer to retain formal symmetry by proceeding as
follows. We determine two numbers $\lambda$ and $\mu$ in such a way that the
two equations


(70a)    $f_z + \lambda\phi_z + \mu\psi_z = 0,$
(70b)    $f_t + \lambda\phi_t + \mu\psi_t = 0$

are satisfied at the point where the extreme value occurs. The determination
of these multipliers $\lambda$ and $\mu$ is possible, since we have assumed
that the Jacobian $d(\phi, \psi)/d(z, t)$ is not zero. If we multiply the
equations (69c, d) by $\lambda$ and $\mu$, respectively, and add them to the
equation (69a), we have

    $f_x + \lambda\phi_x + \mu\psi_x + (f_z + \lambda\phi_z + \mu\psi_z)\,\dfrac{\partial z}{\partial x} + (f_t + \lambda\phi_t + \mu\psi_t)\,\dfrac{\partial t}{\partial x} = 0.$

Hence, by the definition (70a, b) of $\lambda$ and $\mu$,

    $f_x + \lambda\phi_x + \mu\psi_x = 0.$


Similarly, if we multiply the equations (69e, f) by $\lambda$ and $\mu$,
respectively, and add them to the equation (69b), we obtain the further
equation


    $f_y + \lambda\phi_y + \mu\psi_y = 0.$


We thus arrive at the following result: If the point
$(\xi, \eta, \zeta, \tau)$ is an extremum of $f(x, y, z, t)$ subject to the
subsidiary conditions

(71a)    $\phi(x, y, z, t) = 0,$
(71b)    $\psi(x, y, z, t) = 0,$


and if at that point $d(\phi, \psi)/d(z, t)$ is not zero, then two numbers
$\lambda$ and $\mu$ exist such that at the point $(\xi, \eta, \zeta, \tau)$
the equations


(72a)    $f_x + \lambda\phi_x + \mu\psi_x = 0,$
(72b)    $f_y + \lambda\phi_y + \mu\psi_y = 0,$
(72c)    $f_z + \lambda\phi_z + \mu\psi_z = 0,$
(72d)    $f_t + \lambda\phi_t + \mu\psi_t = 0,$


and the subsidiary conditions (71a, b) are satisfied.

These last conditions are perfectly symmetrical. Every trace of
special emphasis on the two variables $x$ and $y$ has disappeared from
them, and we should equally well have obtained (72a, b, c, d) if, instead
of assuming that $d(\phi, \psi)/d(z, t) \ne 0$, we had merely assumed that any
one of the Jacobians $d(\phi, \psi)/d(x, y)$, $d(\phi, \psi)/d(x, z)$, $\ldots$,
$d(\phi, \psi)/d(z, t)$ did not vanish, so that in the neighborhood of the
point in question a certain pair of the quantities $x$, $y$, $z$, $t$ (not
necessarily $z$ and $t$) could be expressed in terms of the other pair. For
this symmetry of our equations we have of course paid a price; in addition to
the unknowns $\xi$, $\eta$, $\zeta$, $\tau$, we now have $\lambda$ and $\mu$
also. Thus, instead of four unknowns, we now have six, determined by the six
equations above.

In exactly the same way, we can state and prove the method of 
undetermined multipliers for an arbitrary number of variables and an 
arbitrary number of subsidiary conditions. The general rule is as 
follows: 


If in a function

    $u = f(x_1, x_2, \ldots, x_n)$

the $n$ variables $x_1, x_2, \ldots, x_n$ are not independent but are connected
by the $m$ subsidiary conditions ($m < n$)

    $\phi_1(x_1, x_2, \ldots, x_n) = 0,$
    $\phi_2(x_1, x_2, \ldots, x_n) = 0,$
    $\ldots$
    $\phi_m(x_1, x_2, \ldots, x_n) = 0,$

then we introduce $m$ multipliers $\lambda_1, \lambda_2, \ldots, \lambda_m$ and
equate the derivatives of the function


    $F = f + \lambda_1\phi_1 + \lambda_2\phi_2 + \cdots + \lambda_m\phi_m$


with respect to $x_1, x_2, \ldots, x_n$, when $\lambda_1, \lambda_2, \ldots, \lambda_m$
are constant, to 0. The equations


    $\dfrac{\partial F}{\partial x_1} = 0, \quad \ldots, \quad \dfrac{\partial F}{\partial x_n} = 0$


thus obtained,¹ together with the $m$ subsidiary conditions

    $\phi_1 = 0, \quad \ldots, \quad \phi_m = 0,$


represent a system of $m + n$ equations for the $m + n$ unknown quantities
$x_1, x_2, \ldots, x_n, \lambda_1, \ldots, \lambda_m$. These equations must be
satisfied at any extreme point of $f$ unless every one of the Jacobians of the
$m$ functions $\phi_1, \phi_2, \ldots, \phi_m$ with respect to $m$ of the
variables $x_1, \ldots, x_n$ has the value 0.
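
To see the count of $m + n$ equations in $m + n$ unknowns at work, consider a small case chosen here purely for illustration (it does not appear in the text): minimize $f = x^2 + y^2 + z^2$ subject to $\phi_1 = x + y + z - 1 = 0$ and $\phi_2 = x - y = 0$, so that $n = 3$ and $m = 2$. Because $f$ is quadratic and the conditions are linear, the five equations form a linear system; the solver `solve_linear` below is a plain Gauss-Jordan elimination written for this sketch.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a*x = b exactly by Gauss-Jordan elimination on Fractions."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Unknowns in order (x, y, z, lam1, lam2), with F = f + lam1*phi1 + lam2*phi2.
coefficients = [
    [2, 0, 0, 1, 1],    # dF/dx = 2x + lam1 + lam2 = 0
    [0, 2, 0, 1, -1],   # dF/dy = 2y + lam1 - lam2 = 0
    [0, 0, 2, 1, 0],    # dF/dz = 2z + lam1        = 0
    [1, 1, 1, 0, 0],    # phi1:  x + y + z         = 1
    [1, -1, 0, 0, 0],   # phi2:  x - y             = 0
]
right_hand_side = [0, 0, 0, 1, 0]

x, y, z, lam1, lam2 = solve_linear(coefficients, right_hand_side)
print("x, y, z =", x, y, z, " multipliers =", lam1, lam2)
```

Here the Jacobian of $(\phi_1, \phi_2)$ with respect to $(x, y)$ equals $-2$, so the exceptional case of the rule does not arise, and the system has the single solution $x = y = z = \tfrac13$ with $\lambda_1 = -\tfrac23$, $\lambda_2 = 0$.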


We observe that this rule gives us an elegant formal method for 
determining the points where extreme values occur; however, it 
merely constitutes a necessary condition. It still remains to investi- 
gate the circumstances under which the points that we find by means 
of the multiplier method actually correspond to a maximum or a mini- 
mum of the function. Into this question we shall not enter; its dis- 
cussion would lead us too far afield. As in the case of free extreme 
values, when we apply the method of undetermined multipliers we 
usually know beforehand that an extremum in the interior of the 
domain of f does exist. If the method determines the point uniquely 
and the exceptional case (all the Jacobians 0) does not occur anywhere 
in the region under discussion, then we can be sure that we have 
really found the point where the extreme value occurs. 


Exercises 3.7e 


1. Interpret the problem of minimizing $u = f(x, y, z)$ subject to the
constraint $\phi(x, y, z) = 0$ geometrically.


2. Give an example of a problem of the form: Extremize $f(x, y, z)$ subject to
the constraints $\phi(x, y) = 0$, $\psi(y, z) = 0$. Interpret this geometrically.


¹Which are identical with those for a “free” extremum of the auxiliary function $F$.


f. Examples


1. As a first example we attempt to find the maximum of the
function $f(x, y, z) = x^2y^2z^2$ subject to the subsidiary condition
$x^2 + y^2 + z^2 = c^2$. On the spherical surface $x^2 + y^2 + z^2 = c^2$, the
function must assume a greatest value, since the surface is a bounded and
closed set. According to the rule, we form the expression


    $F = x^2y^2z^2 + \lambda(x^2 + y^2 + z^2 - c^2)$

and by differentiation obtain

    $2xy^2z^2 + 2\lambda x = 0,$
    $2x^2yz^2 + 2\lambda y = 0,$
    $2x^2y^2z + 2\lambda z = 0.$


The solutions with $x = 0$, $y = 0$, or $z = 0$ can be excluded, for at these
points the function $f$ takes on its least value, zero. The other solutions
of the equations are $x^2 = y^2 = z^2$, $\lambda = -x^4$. Using the subsidiary
condition, we obtain the values

    $x = \pm\dfrac{c}{\sqrt 3}, \qquad y = \pm\dfrac{c}{\sqrt 3}, \qquad z = \pm\dfrac{c}{\sqrt 3}$

for the required coordinates.

At all these points, the function assumes the same value $c^6/27$,
which accordingly is the maximum. Hence, any triad of numbers
satisfies the relation


    $\sqrt[3]{x^2y^2z^2} \le \dfrac{c^2}{3} = \dfrac{x^2 + y^2 + z^2}{3},$

which states that the geometric mean of three nonnegative numbers
$x^2$, $y^2$, $z^2$ is never greater than their arithmetic mean.

One proves similarly for any arbitrary number of positive numbers
that the geometric mean never exceeds the arithmetic mean.¹
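
The maximum value $c^6/27$ and the resulting inequality can be spot-checked by random sampling. This sketch assumes a fixed radius `c = 2.0`, a seeded generator, and helper names chosen here; sampling cannot prove the inequality, it can only fail to contradict it.

```python
import math
import random


def f(x: float, y: float, z: float) -> float:
    """f = x^2 y^2 z^2."""
    return (x * y * z) ** 2


def random_point_on_sphere(rng: random.Random, radius: float) -> tuple:
    """Normalize a Gaussian vector to obtain a point on the sphere of given radius."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return tuple(radius * c / norm for c in v)


c = 2.0
claimed_max = c ** 6 / 27.0
x0 = c / math.sqrt(3.0)
value_at_critical = f(x0, x0, -x0)      # any choice of signs gives the same value

rng = random.Random(0)
samples = [random_point_on_sphere(rng, c) for _ in range(20000)]
largest_sampled = max(f(*p) for p in samples)

# Geometric mean of x^2, y^2, z^2 minus their arithmetic mean (should be <= 0).
worst_excess = max(
    (f(*p)) ** (1.0 / 3.0) - (p[0] ** 2 + p[1] ** 2 + p[2] ** 2) / 3.0
    for p in samples
)

print("claimed maximum:", claimed_max, " value at critical point:", value_at_critical)
print("largest sampled value:", largest_sampled)
print("largest (geometric mean - arithmetic mean):", worst_excess)
```

No sampled point on the sphere exceeds the value at the critical points, and the geometric mean never overtakes the arithmetic mean in the sample.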

¹For another proof, see Volume I, Problem 13, p. 109, or Problem 11, p. 318.

2. As a second example we shall seek to find the triangle (with
sides $x$, $y$, $z$) with given perimeter $2s$, and the greatest possible area.
By the well-known formula of Heron the square of the area is given by


    $f(x, y, z) = s(s - x)(s - y)(s - z).$


We therefore seek the maximum of this function subject to the subsidiary
condition

    $\phi = x + y + z - 2s = 0,$

where $x$, $y$, $z$ are restricted by the inequalities

    $x \ge 0, \quad y \ge 0, \quad z \ge 0, \quad x + y \ge z, \quad x + z \ge y, \quad y + z \ge x.$


On the boundary of this closed region (i.e., whenever one of these
inequalities becomes an equation), we always have $f = 0$. Consequently,
the greatest value of $f$ occurs in the interior and is a
maximum. We form the function


    $F(x, y, z) = s(s - x)(s - y)(s - z) + \lambda(x + y + z - 2s),$

and by differentiation obtain the three conditions

    $-s(s - y)(s - z) + \lambda = 0, \qquad -s(s - x)(s - z) + \lambda = 0, \qquad -s(s - x)(s - y) + \lambda = 0.$

By equating the three expressions we obtain $x = y = z = 2s/3$; that is,
the solution is an equilateral triangle.
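
A coarse grid search gives an independent look at this result. The sketch assumes the semiperimeter `s = 3.0` and a grid of 300 steps per side; these values and the helper names are illustrative choices, not part of the text.

```python
def heron_squared(x: float, y: float, z: float) -> float:
    """Square of the area, s(s - x)(s - y)(s - z), by Heron's formula."""
    s = (x + y + z) / 2.0
    return s * (s - x) * (s - y) * (s - z)


s = 3.0      # semiperimeter, so the perimeter 2s equals 6
n = 300      # grid resolution per side

# Enumerate sides x, y on a grid; z follows from the condition x + y + z = 2s.
candidates = []
for i in range(n + 1):
    for j in range(n + 1):
        x, y = s * i / n, s * j / n
        z = 2.0 * s - x - y
        if 0.0 <= z <= s:            # z <= s is equivalent to x + y >= z
            candidates.append((x, y, z))

best = max(candidates, key=lambda sides: heron_squared(*sides))
best_value = heron_squared(*best)

# The three quantities that the conditions set equal to the multiplier.
x, y, z = best
lam_values = (s * (s - y) * (s - z), s * (s - x) * (s - z), s * (s - x) * (s - y))

print("best sides on the grid:", best, " squared area:", best_value)
print("values equated to the multiplier:", lam_values)
```

On this grid the largest squared area occurs for sides $2, 2, 2$, that is $x = y = z = 2s/3$, and the three expressions equated to $\lambda$ coincide there.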
3. We next prove the inequality


(73a)    $uv \le \dfrac{1}{\alpha}\,u^{\alpha} + \dfrac{1}{\beta}\,v^{\beta}$


for every $u \ge 0$, $v \ge 0$ and every $\alpha > 0$, $\beta > 0$ for which
$1/\alpha + 1/\beta = 1$.

The inequality is certainly valid if either $u$ or $v$ vanishes. We may
therefore restrict ourselves to values of $u$ and $v$ such that $uv \ne 0$. If
the inequality holds for a pair of numbers $u$, $v$, it also holds for all
numbers $ut^{1/\alpha}$, $vt^{1/\beta}$ where $t$ is an arbitrary positive
number. We need therefore consider only values of $u$, $v$ for which $uv = 1$.
Hence, we have to show that the inequality


    $\dfrac{1}{\alpha}\,u^{\alpha} + \dfrac{1}{\beta}\,v^{\beta} \ge 1$


holds for all positive numbers $u$, $v$ such that $uv = 1$.
To do this, we solve the problem of finding the minimum of


    $\dfrac{1}{\alpha}\,u^{\alpha} + \dfrac{1}{\beta}\,v^{\beta}$


subject to the subsidiary condition $uv = 1$. This minimum obviously
exists and occurs at a point $(u, v)$ where $u \ne 0$, $v \ne 0$. Consequently,
there exists a multiplier $-\lambda$ for which we have


    $u^{\alpha - 1} - \lambda v = 0$  and  $v^{\beta - 1} - \lambda u = 0.$


On multiplication by $u$ and $v$, respectively, these equations at once
yield $u^{\alpha} = \lambda$, $v^{\beta} = \lambda$. Taken with $uv = 1$, the
last results imply that $u = v = 1$. The minimum value of


    $\dfrac{1}{\alpha}\,u^{\alpha} + \dfrac{1}{\beta}\,v^{\beta}$

is, therefore, $1/\alpha + 1/\beta = 1$. That is, the statement that


    $\dfrac{1}{\alpha}\,u^{\alpha} + \dfrac{1}{\beta}\,v^{\beta} \ge 1$


when $uv = 1$ is proved.
If in the inequality (73a) we replace $u$ and $v$ by


    $u = u_i \Big/ \Big(\sum_{j=1}^{n} u_j^{\alpha}\Big)^{1/\alpha}$  and  $v = v_i \Big/ \Big(\sum_{j=1}^{n} v_j^{\beta}\Big)^{1/\beta},$


respectively, where $u_1, u_2, \ldots, u_n, v_1, v_2, \ldots, v_n$ are arbitrary
nonnegative numbers and at least one $u$ and at least one $v$ is not zero
and if we sum over $i = 1, \ldots, n$, we obtain Hölder’s inequality


(73b)    $\sum_{i=1}^{n} u_i v_i \le \Big(\sum_{i=1}^{n} u_i^{\alpha}\Big)^{1/\alpha} \Big(\sum_{i=1}^{n} v_i^{\beta}\Big)^{1/\beta}.$


This holds for any $2n$ numbers $u_i$, $v_i$ where $u_i \ge 0$, $v_i \ge 0$
($i = 1, 2, \ldots, n$); not all the $u$’s and not all the $v$’s are zero; and
the indices $\alpha$, $\beta$ are such that $\alpha > 0$, $\beta > 0$,
$1/\alpha + 1/\beta = 1$. The Cauchy-Schwarz inequality is the special case
$\alpha = \beta = 2$ of Hölder’s inequality.
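
Both (73a) and (73b) lend themselves to a quick numerical spot check. The sketch below uses randomly generated positive numbers with a fixed seed; the function names and the set of exponents tried are assumptions made here, and the check illustrates the inequalities without proving them.

```python
import random


def young_gap(u: float, v: float, alpha: float) -> float:
    """Right side minus left side of (73a); beta is the conjugate exponent."""
    beta = alpha / (alpha - 1.0)          # so that 1/alpha + 1/beta = 1
    return u ** alpha / alpha + v ** beta / beta - u * v


def holder_sides(u, v, alpha: float) -> tuple:
    """Return (left side, right side) of (73b) for sequences u and v."""
    beta = alpha / (alpha - 1.0)
    lhs = sum(a * b for a, b in zip(u, v))
    rhs = sum(a ** alpha for a in u) ** (1.0 / alpha) * sum(b ** beta for b in v) ** (1.0 / beta)
    return lhs, rhs


rng = random.Random(1)
smallest_young_gap = float("inf")
smallest_holder_gap = float("inf")
for alpha in (1.5, 2.0, 3.0, 7.0):
    for _ in range(1000):
        n = rng.randint(1, 8)
        u = [rng.random() + 0.01 for _ in range(n)]   # strictly positive entries
        v = [rng.random() + 0.01 for _ in range(n)]
        smallest_young_gap = min(smallest_young_gap, young_gap(u[0], v[0], alpha))
        lhs, rhs = holder_sides(u, v, alpha)
        smallest_holder_gap = min(smallest_holder_gap, rhs - lhs)

# Equality in the Cauchy-Schwarz case alpha = beta = 2 for proportional sequences.
equal_lhs, equal_rhs = holder_sides([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], 2.0)

print("smallest gap in (73a):", smallest_young_gap)
print("smallest gap in (73b):", smallest_holder_gap)
print("proportional case:", equal_lhs, equal_rhs)
```

The gaps stay nonnegative up to rounding, and for proportional sequences with $\alpha = \beta = 2$ the two sides of (73b) agree.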
4. Finally, we seek the point on the closed surface

$$\phi(x, y, z) = 0$$

that is at the least distance from the fixed point $(\xi, \eta, \zeta)$. If the distance is a minimum its square is also a minimum; we accordingly consider the function

$$F(x, y, z) = (x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2 + \lambda\phi(x, y, z).$$


Differentiation gives the conditions

$$2(x - \xi) + \lambda\phi_x = 0, \quad 2(y - \eta) + \lambda\phi_y = 0, \quad 2(z - \zeta) + \lambda\phi_z = 0,$$

or, in another form,

$$\frac{x - \xi}{\phi_x} = \frac{y - \eta}{\phi_y} = \frac{z - \zeta}{\phi_z}.$$


These equations state that the fixed point $(\xi, \eta, \zeta)$ lies on the normal
to the surface at the point of extreme distance (x, y, z). Therefore, 
in order to travel along the shortest path from a point to a (differ- 
entiable) surface, we must travel in a direction normal to the surface. 
Of course, further discussion is required to decide whether we have 
found a maximum or a minimum or neither. Consider, for example, a 
point within a spherical surface. The points of extreme distance lie at 
the ends of the diameter through the point; the distance to one of 
these points is a minimum, to the other a maximum. 
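
The spherical example can be made concrete. For a sphere of radius $R$ about the origin the normal at a surface point is the radial direction, so the points of extreme distance from a fixed point $p \ne 0$ are $\pm R\,p/|p|$. The following sketch uses hypothetical function names and an arbitrary sample point.

```python
import math


def extreme_points_on_sphere(p, radius=1.0):
    """Nearest and farthest points of the sphere |x| = radius from p."""
    norm = math.sqrt(sum(c * c for c in p))
    if norm == 0.0:
        raise ValueError("all points of the sphere are equidistant from the center")
    nearest = tuple(radius * c / norm for c in p)
    farthest = tuple(-radius * c / norm for c in p)
    return nearest, farthest


def distance(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


p = (0.3, 0.0, 0.4)  # sample point inside the unit sphere
nearest, farthest = extreme_points_on_sphere(p)
print(distance(p, nearest), distance(p, farthest))
```

Both extreme points lie on the diameter through $p$; the multiplier conditions alone do not tell which one is the minimum, so the two distances have to be compared.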


Exercises 3.7f 


1. Find the shortest distance between the plane Ax + By + Cz = D and 
the point (a, b, c). 
2. Find the greatest and least distances of a point on the ellipse $x^2/4 + y^2/1 = 1$ from the straight line $x + y - 4 = 0$.
3. Show that the maximum value of the expression

$$\frac{ax^2 + 2bxy + cy^2}{ex^2 + 2fxy + gy^2} \qquad (eg - f^2 > 0)$$

is equal to the greater of the roots of the equation in $\lambda$

$$(ac - b^2) - \lambda(ag - 2bf + ec) + \lambda^2(eg - f^2) = 0.$$

4. Calculate the maximum values of the following expressions:

(a) $\dfrac{x^2 + 6xy + 3y^2}{x^2 - xy + y^2}$

(b) $\dfrac{x^4 + 2x^3y}{x^4 + y^4}$.

5. Find the values of $a$ and $b$ for the ellipse $x^2/a^2 + y^2/b^2 = 1$ of least area containing the circle $(x - 1)^2 + y^2 = 1$ in its interior.

6. Which point of the sphere $x^2 + y^2 + z^2 = 1$ is at the greatest distance from the point $(1, 2, 3)$?

7. Find the point $(x, y, z)$ of the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$ for which
(a) $A + B + C$,
(b) $\sqrt{A^2 + B^2 + C^2}$,
is a minimum, where $A$, $B$, $C$ denote the intercepts that the tangent plane at $(x, y, z)$, where $x > 0$, $y > 0$, $z > 0$, makes on the coordinate axes.
8. Find the rectangular parallelepiped of greatest volume inscribed in the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$.
9. Find the rectangle of greatest perimeter inscribed in the ellipse $x^2/a^2 + y^2/b^2 = 1$.
10. Find the point of the ellipse $5x^2 - 6xy + 5y^2 = 4$ for which the tangent is at the greatest distance from the origin.
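11. Prove that the length $l$ of the greatest axis of the ellipsoid

$$ax^2 + by^2 + cz^2 + 2dxy + 2exz + 2fyz = 1$$

is given by the greatest real root of the equation

$$\begin{vmatrix} a - \dfrac{1}{l^2} & d & e \\ d & b - \dfrac{1}{l^2} & f \\ e & f & c - \dfrac{1}{l^2} \end{vmatrix} = 0.$$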


12. (a) Maximize $x^a y^b z^c$, where $a$, $b$, $c$ are positive constants, subject to the condition $x^k + y^k + z^k = 1$, where $x$, $y$, $z$ are nonnegative and $k > 0$.

(b) From the result of part (a) derive the inequality for any six positive real numbers

$$\left(\frac{u}{a}\right)^{a} \left(\frac{v}{b}\right)^{b} \left(\frac{w}{c}\right)^{c} \le \left(\frac{u + v + w}{a + b + c}\right)^{a+b+c}.$$

13. Let $P_1P_2P_3P_4$ be a convex quadrilateral. Find the point $O$ for which the sum of the distances from $P_1$, $P_2$, $P_3$, $P_4$ is a minimum.

14. Find the quadrilateral with given edges $a$, $b$, $c$, $d$ that includes the greatest area.


Appendix 


A.1 Sufficient Conditions for Extreme Values 


In the theory of maxima and minima in the preceding chapter we 
contented ourselves with finding necessary conditions for the occur- 
rence of an extreme value. In many cases occurring in actual practice 
the nature of the “stationary” point thus found can be determined 
from the special nature of the problem, permitting us to decide 
whether it is a maximum or a minimum. Yet it is important to have 
general sufficient conditions for the occurrence of relative extrema. 
Such criteria will be developed here for the typical case of two in- 
dependent variables. 

If we consider a point $(x_0, y_0)$ at which the function is stationary, that is, a point at which both first partial derivatives of the function vanish, an extreme value occurs if and only if the expression

$$f(x_0 + h, y_0 + k) - f(x_0, y_0)$$

has the same sign for all sufficiently small values of $h$ and $k$. If we expand this expression by Taylor’s theorem with the remainder of the third order and use the equations $f_x(x_0, y_0) = 0$ and $f_y(x_0, y_0) = 0$, we obtain

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) = \tfrac{1}{2}(h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy}) + \varepsilon\rho^2,$$

where $\rho^2 = h^2 + k^2$ and $\varepsilon$ tends to zero with $\rho$.

This suggests that in a sufficiently small neighborhood of the point $(x_0, y_0)$ the behavior of the functional difference $f(x_0 + h, y_0 + k) - f(x_0, y_0)$ is essentially determined by the expression

$$Q(h, k) = ah^2 + 2bhk + ck^2,$$

where for brevity we have put

$$a = f_{xx}(x_0, y_0), \quad b = f_{xy}(x_0, y_0), \quad c = f_{yy}(x_0, y_0).$$

In order to study the problem of extreme values we must investigate this homogeneous quadratic expression or quadratic form $Q$ in $h$ and $k$. We assume that the coefficients $a$, $b$, $c$ do not all vanish. In the exceptional case where they do all vanish, which we shall not consider, we must begin with a Taylor series extending to terms of higher order.

With regard to the quadratic form Q there are three different 
possible cases: 


1. The form is definite. That is, when $h$ and $k$ assume all values, $Q$ assumes values of one sign only and vanishes only for $h = 0$, $k = 0$. We say that the form is positive definite or negative definite according to whether this sign is positive or negative. For example, the expression $h^2 + k^2$, which we obtain when $a = c = 1$, $b = 0$, is positive definite, while the expression $-h^2 + 2hk - 2k^2 = -(h - k)^2 - k^2$ is negative definite.

2. The form is indefinite. That is, it can assume values of different sign; for example, the form $Q = 2hk$, which has the value $2$ for $h = 1$, $k = 1$ and the value $-2$ for $h = -1$, $k = 1$.

3. The third possibility is that the form vanishes for values of $h$, $k$ other than $h = 0$, $k = 0$, but otherwise assumes values of one sign only, for example, the form $(h + k)^2$, which vanishes for all sets of values $h$, $k$ such that $h = -k$. Such forms are called semidefinite.


The quadratic form $Q = ah^2 + 2bhk + ck^2$ is definite if and only if its discriminant $ac - b^2$ satisfies the condition

$$ac - b^2 > 0;$$

it is then positive definite if $a > 0$ (so that $c > 0$ also); otherwise, it is negative definite.

In order that the form may be indefinite, it is necessary and sufficient that

$$ac - b^2 < 0,$$

while the semidefinite case is characterized by the equation¹

$$ac - b^2 = 0.$$
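
The three cases translate directly into a small decision procedure. The following Python sketch is one possible rendering; the function name and the returned labels are chosen here for illustration, and exact (integer or rational) coefficients are assumed so that the comparison with zero is meaningful.

```python
def classify_quadratic_form(a, b, c):
    """Classify Q(h, k) = a*h**2 + 2*b*h*k + c*k**2 by its discriminant a*c - b**2."""
    discriminant = a * c - b * b
    if discriminant > 0:
        # Definite: the sign of Q is the sign of a (and of c).
        return "positive definite" if a > 0 else "negative definite"
    if discriminant < 0:
        return "indefinite"
    if a == 0 and b == 0 and c == 0:
        return "identically zero"  # excluded case: all coefficients vanish
    return "semidefinite"


print(classify_quadratic_form(1, 0, 1))    # h^2 + k^2
print(classify_quadratic_form(-1, 1, -2))  # -h^2 + 2hk - 2k^2
print(classify_quadratic_form(0, 1, 0))    # 2hk
print(classify_quadratic_form(1, 1, 1))    # (h + k)^2
```

The four calls correspond to the examples given in the text for the positive definite, negative definite, indefinite, and semidefinite cases.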


We shall now prove the following statements. If the quadratic 
form Q(h, k) is positive definite, the stationary value assumed for 
h = 0, k = 0 is a relative minimum (even a strict relative minimum). 
If the form is negative definite, the stationary value is a relative 
maximum. If the form is indefinite, we have neither a maximum nor a 
minimum; the point is a saddle point. Thus, definite character of the 
form Q is a sufficient condition for an extreme value, while indefinite 
character of Q excludes the possibility of an extreme value. We shall
not consider the semidefinite case, which leads to involved dis- 
cussions. 

In order to prove the first statement, we observe that if $Q$ is a positive definite form, there is a positive number $m$ independent of $h$ and $k$ such that²

$$Q \ge 2m(h^2 + k^2) = 2m\rho^2.$$

¹ These conditions are easily obtained as follows. Either $a = c = 0$, in which case we must have $b \ne 0$ and the form is, as already remarked, indefinite, so that the criterion holds for this case; or else we must have, say, $a \ne 0$. We can then write

$$ah^2 + 2bhk + ck^2 = a\left[\left(h + \frac{b}{a}k\right)^2 + \frac{ca - b^2}{a^2}k^2\right].$$

This form is obviously definite if $ca - b^2 > 0$, and it then has the same sign as $a$. It is semidefinite if $ca - b^2 = 0$, for then it vanishes for all values of $h$, $k$ that satisfy the equation $h/k = -b/a$, but for all other values it has the same sign. It is indefinite if $ca - b^2 < 0$, for it then assumes values of different sign when $k$ vanishes and when $h + (b/a)k$ vanishes.

² To see this we consider the quotient $Q(h, k)/(h^2 + k^2)$ as a function of the two quantities $u = h/\sqrt{h^2 + k^2}$ and $v = k/\sqrt{h^2 + k^2}$. Then $u^2 + v^2 = 1$, and the form becomes a continuous function of $u$ and $v$, which must have a least value $2m$ on the circle $u^2 + v^2 = 1$. This value $m$ obviously satisfies our conditions; it is not zero, for $u$ and $v$ never vanish simultaneously on the circle.


Therefore,

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) = \tfrac{1}{2}Q(h, k) + \varepsilon\rho^2 \ge (m + \varepsilon)\rho^2.$$

If we now choose $\rho$ so small that the number $\varepsilon$ is less in absolute value than $\tfrac{1}{2}m$, we obviously have

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) \ge \frac{m}{2}\rho^2 > 0.$$

Thus, for this neighborhood of the point $(x_0, y_0)$ the value of the function is everywhere greater than $f(x_0, y_0)$, except of course at $(x_0, y_0)$ itself. In the same way, when the form is negative definite the point is a maximum.

Finally, if the form is indefinite, there is a pair of values $(h_1, k_1)$ for which $Q$ is negative and another pair $(h_2, k_2)$ for which $Q$ is positive. We can therefore find a positive number $m$ such that

$$Q(h_1, k_1) < -2m\rho_1^2, \qquad Q(h_2, k_2) > 2m\rho_2^2,$$

where $\rho_1^2 = h_1^2 + k_1^2$ and $\rho_2^2 = h_2^2 + k_2^2$. If we now put $h = th_1$, $k = tk_1$, $\rho^2 = h^2 + k^2$ $(t \ne 0)$, that is, if we consider a point $(x_0 + h, y_0 + k)$ on the line joining $(x_0, y_0)$ to $(x_0 + h_1, y_0 + k_1)$, then from $Q(h, k) = t^2 Q(h_1, k_1)$ and $\rho^2 = t^2\rho_1^2$ we have

$$Q(h, k) < -2m\rho^2.$$

Thus, by choice of a sufficiently small $t$ (and corresponding $\rho$), we can make the expression $f(x_0 + h, y_0 + k) - f(x_0, y_0)$ negative. We need only choose $t$ so small that for $h = th_1$, $k = tk_1$ the absolute value of the quantity $\varepsilon$ is less than $\tfrac{1}{2}m$. For such a set of values we have $f(x_0 + h, y_0 + k) - f(x_0, y_0) < -m\rho^2/2$, so that the value $f(x_0 + h, y_0 + k)$ is less than the stationary value $f(x_0, y_0)$. In the same way, on carrying out the corresponding process for the system $h = th_2$, $k = tk_2$, we find that in an arbitrarily small neighborhood of $(x_0, y_0)$ there are points at which the value of the function is greater than $f(x_0, y_0)$. Thus, we have neither a maximum nor a minimum but, instead, what we call a saddle value.

If $a = b = c = 0$ at the stationary point, so that the quadratic form vanishes identically, and in the semidefinite case, this discussion fails to apply. To obtain sufficient conditions for these cases would lead to involved distinctions.

Thus, we have the following rule for distinguishing maxima and 
minima: 


At a point $(x_0, y_0)$ where the partial derivatives vanish,

$$f_x(x_0, y_0) = 0, \quad f_y(x_0, y_0) = 0,$$

and the inequality

$$f_{xx}f_{yy} - f_{xy}^2 > 0$$

holds, the function $f$ has a relative extreme value. This is a relative maximum if $f_{xx} < 0$ (and consequently $f_{yy} < 0$), and a relative minimum if $f_{xx} > 0$. If, on the other hand,

$$f_{xx}f_{yy} - f_{xy}^2 < 0,$$

the stationary value is neither a maximum nor a minimum. The case

$$f_{xx}f_{yy} - f_{xy}^2 = 0$$

remains undecided.

These conditions have a simple geometrical interpretation. The 
necessary conditions $f_x = f_y = 0$ state that the tangent plane to the
surface z = f(x, y) is horizontal. If we really have an extreme value, 
then in the neighborhood of the point in question the tangent plane 
does not intersect the surface. In the case of a saddle point, on the 
contrary, the plane cuts the surface in a curve that has several 
branches at the point. This matter will be clearer after the discussion 
of singular points in section A.3. 

As an example we seek the extreme values of the function

$$f(x, y) = x^2 + xy + y^2 + ax + by.$$

If we equate the first derivatives to 0, we obtain the equations

$$2x + y + a = 0, \quad x + 2y + b = 0,$$

which have the solution $x = \tfrac{1}{3}(b - 2a)$, $y = \tfrac{1}{3}(a - 2b)$. The expression

$$f_{xx}f_{yy} - f_{xy}^2 = 3$$

is positive, as is $f_{xx} = 2$. The function therefore has a minimum at the point in question.
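
The computation can be replayed with exact arithmetic. In the sketch below the coefficients $a = 1$, $b = 4$ are arbitrary sample values and the function names are hypothetical; the code solves the two linear equations, applies the rule stated above, and compares $f$ at the stationary point with nearby values.

```python
from fractions import Fraction


def f(x, y, a, b):
    """The example function x^2 + xy + y^2 + ax + by."""
    return x * x + x * y + y * y + a * x + b * y


def stationary_point(a, b):
    """Exact solution of 2x + y + a = 0, x + 2y + b = 0."""
    a, b = Fraction(a), Fraction(b)
    return (b - 2 * a) / 3, (a - 2 * b) / 3


def classify(f_xx, f_xy, f_yy):
    """Second-derivative rule for a stationary point in two variables."""
    discriminant = f_xx * f_yy - f_xy**2
    if discriminant > 0:
        return "minimum" if f_xx > 0 else "maximum"
    if discriminant < 0:
        return "saddle"
    return "undecided"


a, b = 1, 4  # sample coefficients, chosen only for illustration
x0, y0 = stationary_point(a, b)
# The second derivatives of this f are constant: f_xx = 2, f_xy = 1, f_yy = 2.
print(x0, y0, classify(2, 1, 2))
```

Because the second derivatives are constant here, the quadratic form describes the difference $f(x_0 + h, y_0 + k) - f(x_0, y_0)$ exactly, and the minimum is in fact an absolute one.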
The function

$$f(x, y) = (y - x^2)^2 + x^5$$

has a stationary point at the origin. There the expression $f_{xx}f_{yy} - f_{xy}^2$ vanishes, and our criterion fails. We readily see, however, that the function has no extreme value there, for in the neighborhood of the origin the function assumes both positive and negative values.

On the other hand, the function

$$f(x, y) = (x - y)^4 + (y - 1)^4$$

has a minimum at the point $x = 1$, $y = 1$, though the expression $f_{xx}f_{yy} - f_{xy}^2$ vanishes there. For

$$f(1 + h, 1 + k) - f(1, 1) = (h - k)^4 + k^4,$$

and this quantity is positive when $\rho \ne 0$.
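
The behavior in these two undecided cases can be checked by direct evaluation. The sketch below takes the two examples to be $(y - x^2)^2 + x^5$ and $(x - y)^4 + (y - 1)^4$ (the exponents are read this way here, which is an assumption about the printed formulas), and the helper names are hypothetical.

```python
def f1(x, y):
    """First example: stationary at the origin, but no extreme value there."""
    return (y - x**2) ** 2 + x**5


def f2(x, y):
    """Second example: minimum at (1, 1) although the discriminant vanishes."""
    return (x - y) ** 4 + (y - 1) ** 4


def signs_near_origin(t=0.1):
    """Values of f1 at two points of the parabola y = x^2 close to the origin."""
    return f1(t, t * t), f1(-t, t * t)


def smallest_increase(f, x0, y0, step=0.01, n=5):
    """Minimum of f(x0 + i*step, y0 + j*step) - f(x0, y0) over a small grid."""
    differences = []
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            if (i, j) != (0, 0):
                differences.append(f(x0 + i * step, y0 + j * step) - f(x0, y0))
    return min(differences)


print(signs_near_origin())
print(smallest_increase(f2, 1.0, 1.0) > 0)
```

On the parabola $y = x^2$ the first function reduces to $x^5$, which changes sign at the origin, while the second function increases in every sampled direction away from $(1, 1)$.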


Exercises A.1 


1. Find and characterize the extreme values of the functions: 
(a) f(x, y) = x? — 3xy + y? 
(b) f(x, y) = cos (x + y) + sin (x — y) + x? 
(c) f(x, y) = x cosh y — y’. 

2. If $\phi(a) = k \ne 0$, $\phi'(a) \ne 0$, and $x$, $y$, $z$ satisfy the relation $\phi(x)\phi(y)\phi(z) = k^3$, prove that the function $f(x) + f(y) + f(z)$ has a maximum when $x = y = z = a$, provided that

$$f'(a)\left(\frac{\phi''(a)}{\phi'(a)} - \frac{\phi'(a)}{\phi(a)}\right) > f''(a).$$

3. Let $P_1P_2P_3$ be a plane triangle with all three angles less than 120°. Prove by the criterion of p. 349 or of Exercise 6 below that at the point $P$ interior to $P_1P_2P_3$ such that $\angle P_2PP_3 = \angle P_3PP_1 = \angle P_1PP_2 = 120°$, the sum $PP_1 + PP_2 + PP_3$ is actually a minimum (cf. Example 3, p. 328).

4. Where does the minimum of the sum $PP_1 + PP_2 + PP_3$ occur if in the triangle of Exercise 3 the angle $P_2P_1P_3$ is greater than, or equal to, 120°?

5. (a) Prove that if all the symbols denote positive quantities the stationary value of $lx + my + nz$ subject to the condition $x^p + y^p + z^p = c^p$ is $c(l^q + m^q + n^q)^{1/q}$, where $q = p/(p - 1)$.

(b) Show that the value is a maximum or minimum according to whether $p \gtrless 1$.




6. Generalize the investigation of Section A.1 to functions of $n$ variables, proving the following results. Let $f(x_1, \dots, x_n)$ be three times continuously differentiable in the neighborhood of a stationary point $x_1 = x_1^0, \dots, x_n = x_n^0$, that is, a point where $f_{x_1} = f_{x_2} = \cdots = f_{x_n} = 0$. Consider the second total differential of $f$ at the point $x^0$,

$$d^2f^0 = \sum_{i,k=1}^{n} f^0_{x_ix_k}\,dx_i\,dx_k;$$

this is a quadratic form in the variables $dx_1, \dots, dx_n$. If this quadratic form is nondegenerate, that is, if

$$D = \begin{vmatrix} f^0_{x_1x_1} & \cdots & f^0_{x_1x_n} \\ \vdots & & \vdots \\ f^0_{x_nx_1} & \cdots & f^0_{x_nx_n} \end{vmatrix} \ne 0,$$

then $d^2f^0$ may be (1) positive definite, (2) negative definite or (3) indefinite. Prove that these possible cases correspond respectively to the following properties of $f$ at the point $x^0$: (1) $f$ has a minimum, (2) $f$ has a maximum, (3) $f$ has neither a minimum nor a maximum.


7. To investigate stationary points of $f = f(x_1, \dots, x_n)$, where the variables satisfy the relations

$$\phi_1(x_1, \dots, x_n) = 0, \quad \dots, \quad \phi_m(x_1, \dots, x_n) = 0 \qquad (m < n), \tag{1}$$

we may assume that we have found numerical values for the variables and the multipliers $\lambda_\mu$ such that $F = f + \lambda_1\phi_1 + \cdots + \lambda_m\phi_m$ satisfies the equations

$$\frac{\partial F}{\partial x_1} = 0, \quad \dots, \quad \frac{\partial F}{\partial x_n} = 0, \tag{2}$$

and such that the Jacobian of $\phi_1, \dots, \phi_m$ with respect to the variables $x_1, \dots, x_m$ is not 0. To apply the criterion of Exercise 6 we may proceed as follows: Regarding $x_{m+1}, \dots, x_n$ as independent variables, by differentiating (1) we can obtain the first and second differentials of $x_1, \dots, x_m$ as functions of $x_{m+1}, \dots, x_n$ and finally introduce these values into

$$d^2f = \sum_{i,k=1}^{n} f_{x_ix_k}\,dx_i\,dx_k + f_{x_1}\,d^2x_1 + \cdots + f_{x_m}\,d^2x_m. \tag{3}$$

Prove the following second rule, not involving the computation of the second differentials $d^2x_1, \dots, d^2x_m$: Regarding $x_1, \dots, x_n$ as independent variables, consider

$$d^2F = \sum_{i,k=1}^{n} F_{x_ix_k}\,dx_i\,dx_k = d^2f + \lambda_1\,d^2\phi_1 + \cdots + \lambda_m\,d^2\phi_m;$$

compute $dx_1, \dots, dx_m$ from the equations

$$d\phi_\mu = \frac{\partial\phi_\mu}{\partial x_1}\,dx_1 + \cdots + \frac{\partial\phi_\mu}{\partial x_n}\,dx_n = 0 \qquad (\mu = 1, \dots, m)$$

and introduce these values into $d^2F$, thus obtaining a quadratic form $\delta^2F$ in the variables $dx_{m+1}, \dots, dx_n$. If this quadratic form is nondegenerate, then $f$ has, respectively, a minimum, a maximum, or neither of these, according to whether $\delta^2F$ is positive definite, negative definite, or indefinite.

8. In the problem of finding the maximum of $f = x_1x_2 \cdots x_n$ subject to the condition $\phi = x_1 + x_2 + \cdots + x_n - a = 0$ $(a > 0)$, the rule of undetermined multipliers gives a stationary value of $f$ at the point $x_1 = x_2 = \cdots = x_n = a/n$. Apply the rule of Exercise 7, instead of the consideration of the absolute maximum, to show that $f$ has a maximum value at this point.

9. Apply the criterion of Exercise 7, to prove that among all triangles of 
constant perimeter the equilateral triangle has the largest area (cf. 
p. 341). 


A.2 Numbers of Critical Points Related to Indices of a Vector Field


A continuous function $f(x, y)$ defined in a closed and bounded set $R$ certainly has a maximum point and minimum point in $R$, by our fundamental theorem (see p. 112). If a maximum or minimum point $(x_0, y_0)$ is an interior point of $R$ and if $f$ is differentiable at $(x_0, y_0)$, then $(x_0, y_0)$ is a critical point of $f$. In some cases this observation permits us to deduce the existence of at least one critical point of $f$. For example, if the set $R$ consists of an open, bounded set $S$ and its boundary $B$ and if $f$ is constant on $B$ and differentiable in $S$, then $f$ has at least one critical point in $S$. This is just an extension of Rolle’s theorem (see Volume I, p. 175) to functions of several variables, and it is proved in the same way: The function $f$ has maximum and minimum points. If these all lie on the boundary $B$ where $f$ is constant, then the maximum and minimum values of $f$ coincide; then $f$ is constant in $S$ as well and every point of $S$ is critical. Hence, there is at least one critical point of $f$ in $S$.

In the case of functions of a single independent variable, more 
specific information on the number of critical points of a certain type 
is available. Relative maxima and minima alternate (see Volume I, 
p. 239). Hence, the total numbers of relative maxima and of minima 
of a function in an interval differ by, at most, 1. This is not true for 
functions of two variables defined in a set R of the plane. There exists, 
however, an (intuitively less obvious) relation connecting the total 
numbers of relative extrema and of saddle points in the interior of R 
with the values of f on the boundary of R. In order to formulate this 
relation, we first have to consider the gradient field of f and to introduce 
the notion of index of a closed curve with respect to a vector field. 

Assume that $f$ is continuous and has continuous first derivatives in the set $R$ of the $x, y$-plane. Then $f$ determines at each point of $R$ the two quantities

$$u = f_x(x, y), \quad v = f_y(x, y). \tag{74}$$

These can be interpreted as the components of a certain vector, the gradient of $f$. The gradients at the various points of $R$ form a vector field. The critical points of $f$ are those points of $R$ where the gradient vanishes. At all other points, the gradient vector has a uniquely determined direction described, for example, by its direction cosines

$$\xi = \frac{u}{\sqrt{u^2 + v^2}} \quad\text{and}\quad \eta = \frac{v}{\sqrt{u^2 + v^2}}$$

(see Volume I, p. 383). Clearly, $\xi$ and $\eta$ are continuous functions of $(x, y)$ at every noncritical point of $R$. We can put


$$\xi = \cos\theta, \quad \eta = \sin\theta,$$

where, however, the angle $\theta$, the inclination of the vector $(u, v)$, is determined only within whole multiples of $2\pi$. In general, it is not possible to select one definite value for $\theta$ that will then vary continuously with $(x, y)$. On the other hand, the differential

$$d\theta = d\arctan\frac{v}{u} = \frac{u\,dv - v\,du}{u^2 + v^2} = \frac{(uv_x - vu_x)\,dx + (uv_y - vu_y)\,dy}{u^2 + v^2} \tag{75}$$

is defined unambiguously for every noncritical point $(x, y)$ of $R$.

Now let $C$ be an oriented closed curve that lies in $R$ and does not pass through any critical point of $f$. We define the Poincaré index $I_C$ of $C$ with respect to the vector field as the number

$$I_C = \frac{1}{2\pi}\oint_C \frac{u\,dv - v\,du}{u^2 + v^2}. \tag{76}$$

If $C$ is given parametrically by

$$x = \phi(t), \quad y = \psi(t) \qquad (a \le t \le b),$$

where $\phi$ and $\psi$ have the same values at the two end points of the $t$-interval and where the orientation of $C$ corresponds to the sense of increasing $t$, then the index of $C$ is given by the integral

$$I_C = \frac{1}{2\pi}\int_a^b \left(\frac{u}{u^2 + v^2}\frac{dv}{dt} - \frac{v}{u^2 + v^2}\frac{du}{dt}\right)dt.$$

Since, after traversing the curve $C$, we return to the same point $(x, y)$, the values for $\theta$ corresponding to $t = a$ and $t = b$ can only differ by a multiple of $2\pi$. Hence, $I_C$ is always an integer. This integer counts the total number of counterclockwise rotations performed by the vector $(u, v)$ as we go around the curve $C$ in the sense indicated by its orientation.¹ Of course, $I_C$ changes sign when we change the orientation of $C$. As an illustration, consider the function

$$f(x, y) = x^2 + y^2.$$

Here the gradient

$$(u, v) = (2x, 2y)$$


at any point $(x, y)$ has the direction of the radius vector from the origin. Assume we make use of a right-handed coordinate system. For a closed curve $C$ that does not pass through the origin the index,

$$I_C = \frac{1}{2\pi}\oint_C \frac{x\,dy - y\,dx}{x^2 + y^2},$$

measures the total number of counterclockwise turns performed by the radius from the origin in going around the curve $C$. This is exactly the formula for the number of times the curve $C$ winds about the origin derived in Volume I (p. 434).
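
The index can also be approximated numerically by accumulating the change of the angle $\theta$ along the curve. The following Python sketch does this for circles; the function name `poincare_index`, the restriction to circles, and the sample fields are assumptions made here for illustration.

```python
import math


def poincare_index(field, center=(0.0, 0.0), radius=1.0, samples=2000):
    """Approximate index of the counterclockwise circle with respect to `field`.

    `field(x, y)` must return the components (u, v) and must not vanish
    on the circle.
    """
    total = 0.0
    previous = None
    for i in range(samples + 1):
        t = 2.0 * math.pi * i / samples
        x = center[0] + radius * math.cos(t)
        y = center[1] + radius * math.sin(t)
        u, v = field(x, y)
        angle = math.atan2(v, u)
        if previous is not None:
            # Wrap the increment into [-pi, pi) so that jumps of atan2 cancel.
            total += (angle - previous + math.pi) % (2.0 * math.pi) - math.pi
        previous = angle
    return round(total / (2.0 * math.pi))


def radial(x, y):
    """Gradient of x^2 + y^2."""
    return 2.0 * x, 2.0 * y


def saddle(x, y):
    """Gradient of x^2 - y^2."""
    return 2.0 * x, -2.0 * y


print(poincare_index(radial))
print(poincare_index(saddle))
print(poincare_index(radial, center=(3.0, 0.0)))
```

The first circle encloses the minimum of $x^2 + y^2$, the second encloses a saddle point, and the third encloses no critical point at all, so the three computed indices can be compared with the discussion that follows. The sampling must be fine enough that the direction of the field changes by less than $\pi$ between neighboring sample points.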

Generally, at points where $u$ and $v$ do not both vanish, the differential $d\theta$ of equation (75) satisfies the integrability condition

$$\left(\frac{uv_x - vu_x}{u^2 + v^2}\right)_y = \left(\frac{uv_y - vu_y}{u^2 + v^2}\right)_x,$$

which can be verified directly and, of course, only reflects the relation

$$\left[\left(\arctan\frac{v}{u}\right)_x\right]_y = \left[\left(\arctan\frac{v}{u}\right)_y\right]_x,$$

which holds in spite of the possible multiple-valuedness of the function $\arctan(v/u)$. It follows from the fundamental theorem on line integrals (see p. 104 and p. 97) that $I_C = 0$ if $C$ is the boundary of a simply connected subset of $R$ that contains no critical points of $f$.


¹ For the definition of “index” it is not necessary that the vector field be a gradient field.

More generally, consider a multiply connected set $R$ with a number of closed boundary curves $C_1, C_2, \dots, C_n$. Let the $x, y$-coordinate system be right-handed, as usual. Assume each $C_i$ is oriented in such a way that we leave $R$ to our left in traversing $C_i$ in the sense corresponding to its orientation. Assume that we can divide $R$ into simply connected sets $R_k$ by suitable auxiliary arcs joining various $C_i$ (cf. Fig. 3.31). Let $f$ have no critical points in $R$. Then,

$$\oint d\theta = 0$$

when extended over the boundary of any $R_k$ traversed in the counterclockwise sense. Forming the sum of the integrals over the boundaries of all the $R_k$, we see that the contributions from the auxiliary arcs cancel out (see p. 94) and we find that

$$0 = \sum_i \int_{C_i} d\theta.$$

This means, however, that

$$\sum_i I_{C_i} = 0 \tag{77}$$

if the $C_i$ are closed curves forming the boundary of a set $R$ free of critical points of $f$, and with a sense of orientation leaving $R$ to the left.

Figure 3.31 Multiply connected region with positively oriented boundary curves $C_i$ divided into simply connected sets.

As a consequence we obtain the theorem that there exists at least one critical point in $R$, whenever the sum of the indices of the boundary curves of $R$ (oriented as explained) is different from zero.

More precise information on the number of critical points in $R$ is obtained if we assume that $f$ has continuous second derivatives in $R$, that $f$ has only a finite number of critical points $(x_1, y_1), \dots, (x_N, y_N)$, and that at each critical point the discriminant

$$D = f_{xx}f_{yy} - f_{xy}^2$$

does not vanish. All critical points are then either relative maxima or minima corresponding to $D > 0$ or saddle points corresponding to $D < 0$ (see p. 349). Assume that $R$ again is bounded by oriented simple closed curves $C_1, \dots, C_n$ that do not pass through any of the critical points of $f$. We can cut out a small neighborhood of each critical point $(x_k, y_k)$ bounded by a curve $\gamma_k$. There remains a set bounded by the curves $C_1, \dots, C_n, \gamma_1, \dots, \gamma_N$ that is free of critical points of $f$. Giving each $\gamma_k$ the counterclockwise orientation, we have then, by (77),

$$\sum_{i=1}^{n} I_{C_i} - \sum_{k=1}^{N} I_{\gamma_k} = 0. \tag{78}$$

Now the index of one of the curves $\gamma_k$ bounding a set containing a single critical point $(x_k, y_k)$ just depends on the type of that point, as we shall show.

Let $\gamma_k$ be a small circle

$$x = x_k + r\cos t, \quad y = y_k + r\sin t$$

of radius $r$ and center at the critical point $(x_k, y_k)$. By Taylor’s theorem, we have on $\gamma_k$

$$u = f_x(x, y) = (x - x_k)f_{xx}(x_k, y_k) + (y - y_k)f_{xy}(x_k, y_k) + \cdots = r(a\cos t + b\sin t) + O(r^2), \tag{79a}$$

$$v = f_y(x, y) = (x - x_k)f_{xy}(x_k, y_k) + (y - y_k)f_{yy}(x_k, y_k) + \cdots = r(b\cos t + c\sin t) + O(r^2), \tag{79b}$$

where we put

$$a = f_{xx}(x_k, y_k), \quad b = f_{xy}(x_k, y_k), \quad c = f_{yy}(x_k, y_k).$$

In order to find out how often the vector $(u, v)$ turns in the counterclockwise sense as $t$ varies from $0$ to $2\pi$, we observe that the point in the plane with coordinates $(u, v)$ (that is, the point whose position vector has components $u$, $v$) approximately describes the ellipse $E$ with parametric representation

$$u = r(a\cos t + b\sin t), \quad v = r(b\cos t + c\sin t). \tag{80}$$

This ellipse has its center at the origin and has the nonparametric equation

$$(cu - bv)^2 + (av - bu)^2 = r^2(ac - b^2)^2.$$


It is clear that the point $(u, v)$ describes the ellipse $E$ in (80) exactly once as $t$ increases from $0$ to $2\pi$, so that the index of $\gamma_k$ certainly is either $+1$ or $-1$ depending on the counterclockwise or clockwise sense of $E$ corresponding to increasing $t$. Now the linear mapping

$$u = r(a\xi + b\eta), \quad v = r(b\xi + c\eta)$$

clearly takes the circle

$$\xi = \cos t, \quad \eta = \sin t$$

in the $\xi, \eta$-plane (where increasing $t$ corresponds to the counterclockwise sense on the circle) into $E$. Since the sense of curves is preserved or inverted according to the sign of the Jacobian $r^2(ac - b^2)$ of the mapping (see p. 260), we see that¹

$$I_{\gamma_k} = \operatorname{sgn}(ac - b^2) = \operatorname{sgn}[f_{xx}(x_k, y_k)f_{yy}(x_k, y_k) - f_{xy}^2(x_k, y_k)] = \operatorname{sgn} D(x_k, y_k).$$


¹ The same result for $I_{\gamma_k}$ can be obtained analytically by observing that, by formulae (79a, b),

$$\lim_{r \to 0} I_{\gamma_k} = \lim_{r \to 0} \frac{1}{2\pi}\oint_{\gamma_k} \frac{u\,dv - v\,du}{u^2 + v^2} = \frac{1}{2\pi}\int_0^{2\pi} \frac{(ac - b^2)\,dt}{(a\cos t + b\sin t)^2 + (b\cos t + c\sin t)^2}.$$

The integral can be evaluated explicitly (see Volume I, p. 294) and has the value $2\pi \operatorname{sgn}(ac - b^2)$.

It follows from (78) that

$$\sum_{i=1}^{n} I_{C_i} = \sum_{k=1}^{N} \operatorname{sgn} D(x_k, y_k).$$

As observed earlier, $\operatorname{sgn} D(x_k, y_k) = +1$ when the critical point $(x_k, y_k)$ is either a relative maximum or minimum, and $\operatorname{sgn} D(x_k, y_k) = -1$ when it is a saddle point. Let $M_0$, $M_1$, $M_2$ denote, respectively, the numbers of minima, saddle points, and maxima in $R$. Our result becomes the Poincaré identity¹

$$\sum_{i=1}^{n} I_{C_i} = M_0 - M_1 + M_2. \tag{81}$$

In words, the excess of the number of relative maxima and minima of f 
in R over the number of saddle points equals the sum of the indices of 
the boundary curves of R with respect to the gradient field of f, where 
each boundary curve is oriented so as to leave R on the left-hand side. 
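
The identity can be checked on a concrete function. The sketch below uses $f(x, y) = (x^2 - 1)^2 + y^2$ on the disc of radius 2; this function, the disc, and all names in the code are assumptions chosen for illustration. Its critical points $(\pm 1, 0)$ and $(0, 0)$ are classified by the discriminant, and the index of the boundary circle is approximated by accumulating the angle of the gradient.

```python
import math


def gradient(x, y):
    """Gradient of the assumed example f(x, y) = (x**2 - 1)**2 + y**2."""
    return 4.0 * x * (x * x - 1.0), 2.0 * y


def second_derivatives(x, y):
    """(f_xx, f_xy, f_yy) of the same example function."""
    return 12.0 * x * x - 4.0, 0.0, 2.0


def boundary_index(field, radius, samples=4000):
    """Approximate index of the counterclockwise circle of given radius."""
    total = 0.0
    previous = None
    for i in range(samples + 1):
        t = 2.0 * math.pi * i / samples
        u, v = field(radius * math.cos(t), radius * math.sin(t))
        angle = math.atan2(v, u)
        if previous is not None:
            # Wrap each increment into [-pi, pi).
            total += (angle - previous + math.pi) % (2.0 * math.pi) - math.pi
        previous = angle
    return round(total / (2.0 * math.pi))


critical_points = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]  # zeros of the gradient
minima = saddles = maxima = 0
for x, y in critical_points:
    f_xx, f_xy, f_yy = second_derivatives(x, y)
    discriminant = f_xx * f_yy - f_xy**2
    if discriminant < 0:
        saddles += 1
    elif f_xx > 0:
        minima += 1
    else:
        maxima += 1

index = boundary_index(gradient, radius=2.0)
print(index, minima - saddles + maxima)
```

Here the function has two minima and one saddle point inside the disc, and the two printed numbers should agree, as (81) requires.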

The result is particularly simple when $f$ is constant along each boundary curve $C_i$ of $R$. The gradient vector of $f$ then is perpendicular to $C_i$ (see p. 233) and has the direction of either the exterior or the interior normal of $C_i$. If no critical point of $f$ lies on $C_i$ and $C_i$ is a smooth closed simple curve, the direction of the gradient varies continuously and cannot jump at any point of $C_i$ from that of exterior to that of interior normal or vice versa. It is clear then that the gradient vector turns exactly once along $C_i$, and in the same sense as the tangent vector of $C_i$ with which the gradient forms a fixed angle. Thus, $I_{C_i} = +1$ when $C_i$ has the counterclockwise sense, and $-1$ when it has the clockwise one. It is easily seen that with our convention about the orientation of the boundary curves of $R$ a boundary curve $C_i$ has counterclockwise orientation when it forms the “outer” boundary of one of the disconnected pieces making up $R$ and has clockwise orientation if it bounds one of the “holes” in $R$ (see Fig. 3.31). It follows that for $f$ constant on the boundary curves

$$M_0 - M_1 + M_2 = N_0 - N_1, \tag{82}$$

where $N_0$ is the number of connected components of $R$ and $N_1$ is the total number of holes in $R$ (the “connectivity” of $R$).

Take, for example, the case where $R$ is a circular disc. Here $N_0 = 1$, $N_1 = 0$, and thus, for $f$ constant on the boundary,

$$M_0 - M_1 + M_2 = 1.$$

We find here that the total number of critical points in the interior of $R$ is

$$M_0 + M_1 + M_2 = 1 + 2M_1$$

and, hence, certainly is an odd number. Moreover, if the number $M_0 + M_2$ of relative extrema of $f$ exceeds 1, then $f$ has at least one saddle point in $R$.

¹ The formulae corresponding to the Poincaré identity (81) for functions of more than two independent variables are those of M. Morse.


For a circular ring $R$ we have

$$N_0 = 1, \quad N_1 = 1,$$

and thus, for $f$ constant on each boundary curve,

$$M_0 - M_1 + M_2 = 0.$$

Take the case where $f$ has the same constant value on each of the two boundary curves. Then $f$ is either constant everywhere or assumes its maximum or minimum in the interior of $R$. If we postulate that $f$ has only critical points with $f_{xx}f_{yy} - f_{xy}^2 \ne 0$, the case of constant $f$ is excluded. It follows then that $M_0 + M_2 > 0$ and, hence, that $M_1 > 0$. Hence, a function in a circular ring that vanishes everywhere on the boundary has at least one critical point with $f_{xx}f_{yy} - f_{xy}^2 < 0$ in the interior.


Exercises A.2 


1. Give an example of a continuous function f that has a singularity at the 
origin of index 


(a) —1; 
(b) —2; 
(c) —n, where n is a natural number. 


2. Give an example of a function f, not required to be continuous, which 
has a singularity at the origin of index 


(a) 2; 
(b) n, where n is a natural number. 


3. Let the closed convex region $R$ in the $x, y$-plane be bounded by a closed convex curve $C$ with continuously turning tangent. Let

$$\xi = f(x, y), \quad \eta = g(x, y)$$

be a continuously differentiable mapping of $R$ into itself. Prove that the mapping has at least one “fixed point” in $R$, that is, that there exists a point $(x, y)$ in $R$ such that

$$x = f(x, y), \quad y = g(x, y).$$

The analogous fixed point theorem in $n$ dimensions is due to Brouwer. [Hint. Consider the field of vectors with components $u = f(x, y) - x$, $v = g(x, y) - y$.]


A.3 Singular Points of Plane Curves


On p. 236 we saw that a curve $f(x, y) = 0$ in general has a singularity at a point $x = x_0$, $y = y_0$ such that the three equations

$$f(x_0, y_0) = 0, \quad f_x(x_0, y_0) = 0, \quad f_y(x_0, y_0) = 0$$

hold. In order to study these singular points systematically, we assume that in the neighborhood of $(x_0, y_0)$ the function $f(x, y)$ has continuous derivatives up to the second order and that at that point the second derivatives do not all vanish. By expanding in a Taylor series up to terms of second order, we obtain the equation of the curve in the form

$$2f(x, y) = (x - x_0)^2 f_{xx}(x_0, y_0) + 2(x - x_0)(y - y_0)f_{xy}(x_0, y_0) + (y - y_0)^2 f_{yy}(x_0, y_0) + \varepsilon\rho^2 = 0,$$

where we have put $\rho^2 = (x - x_0)^2 + (y - y_0)^2$ and $\varepsilon$ tends to 0 with $\rho$.
Using a parameter $t$, we can write the equation of the general straight line through the point $(x_0, y_0)$ in the form

$$x - x_0 = at, \quad y - y_0 = bt,$$

where $a$ and $b$ are two arbitrary constants that we may suppose to be so chosen that $a^2 + b^2 = 1$. To determine the point of intersection of this line with the curve $f(x, y) = 0$, we substitute these expressions in the above expansion for $f(x, y)$. For the point of intersection, we thus obtain the equation

$$a^2t^2f_{xx} + 2abt^2f_{xy} + b^2t^2f_{yy} + \varepsilon t^2 = 0.$$

A first solution is $t = 0$, that is, the point $(x_0, y_0)$ itself, as is obvious. However, it is noteworthy that the left-hand side of the equation is divisible by $t^2$, so that $t = 0$ is a double root of the equation. For this reason the singular points are also sometimes called double points of the curve. If we remove the factor $t^2$, we are left with the equation

$$a^2f_{xx} + 2abf_{xy} + b^2f_{yy} + \varepsilon = 0.$$


We now inquire whether it is possible for the line to intersect the
curve in another point that tends to $(x_0, y_0)$ as the line tends to some
particular limiting position. Such a limiting position of a secant we
of course call a tangent. To discuss this, we observe that as a point
tends to $(x_0, y_0)$ the quantity $t$ tends to 0, and therefore $\varepsilon$ also tends
to 0. If the equation above is still to be satisfied, the expression
$a^2 f_{xx} + 2ab f_{xy} + b^2 f_{yy}$ must also tend to 0, that is, for the limiting
position of the line, we must have

$$a^2 f_{xx} + 2ab f_{xy} + b^2 f_{yy} = 0.$$

This equation gives us a quadratic condition determining the ratio
$a/b$, which fixes the slope of a tangent.
If the discriminant of the equation is negative, that is, if

$$f_{xx} f_{yy} - f_{xy}^2 < 0,$$

we obtain two distinct real tangents. The curve has a double point,
or node, like that exhibited by the lemniscate $(x^2 + y^2)^2 - (x^2 - y^2) = 0$
at the origin or by the strophoid $(x^2 + y^2)(x - 2a) + a^2 x = 0$ at the
point $x_0 = a$, $y_0 = 0$.

If the discriminant vanishes, that is, if

$$f_{xx} f_{yy} - f_{xy}^2 = 0,$$

we obtain two coincident tangents; it is then possible that two
branches of the curve touch one another or that the curve has a
cusp.

Finally, if

$$f_{xx} f_{yy} - f_{xy}^2 > 0,$$


there is no (real) tangent at all. This occurs for example in the case of 
the so-called isolated points of an algebraic curve. These are points at 
which the equation of the curve is satisfied but in whose neighborhood 
no other point of the curve lies. 

The curve $(x^2 - a^2)^2 + (y^2 - b^2)^2 = a^4 + b^4$ exemplifies this. The
values $x = 0$, $y = 0$ satisfy the equation, but for all other values in
the region $|x| < a\sqrt{2}$, $|y| < b\sqrt{2}$ the left-hand side is less than the
right.
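
The three cases above can be collected in a small routine. The following is only a sketch: the function name `classify_singular_point` and its return strings are our own, and the second derivatives at the singular point are assumed to have been computed by hand. The semicubical parabola $y^2 - x^3 = 0$ is added as an extra illustration of the vanishing-discriminant case and does not appear in the text.

```python
def classify_singular_point(fxx, fxy, fyy):
    """Classify a singular point of f(x, y) = 0 from the values of the
    second derivatives f_xx, f_xy, f_yy at that point (hypothetical helper)."""
    if fxx == 0 and fxy == 0 and fyy == 0:
        # The case omitted in the text: no conclusion from second order terms.
        return "undetermined (all second derivatives vanish)"
    d = fxx * fyy - fxy ** 2
    if d < 0:
        return "node (two distinct real tangents)"
    if d == 0:
        return "two coincident tangents (cusp or touching branches possible)"
    return "isolated point (no real tangent)"


# Lemniscate (x^2 + y^2)^2 - (x^2 - y^2) = 0 at the origin:
# f_xx = -2, f_xy = 0, f_yy = 2.
lemniscate = classify_singular_point(-2, 0, 2)

# (x^2 - a^2)^2 + (y^2 - b^2)^2 = a^4 + b^4 with a = b = 1 at the origin:
# f_xx = -4 a^2 = -4, f_xy = 0, f_yy = -4 b^2 = -4.
isolated = classify_singular_point(-4, 0, -4)

# Semicubical parabola y^2 - x^3 = 0 at the origin (our own example):
# f_xx = 0, f_xy = 0, f_yy = 2.
cusp = classify_singular_point(0, 0, 2)

print(lemniscate)
print(isolated)
print(cusp)
```

The routine only reports what the quadratic condition allows; in the coincident case a closer study of the curve is still needed to decide between a cusp and touching branches.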

We have omitted the case in which all the derivatives of the second 
order vanish. This case leads to involved considerations and we shall 
not investigate it. Through such a point, several branches of the curve 
may pass, or singularities of other types may occur. 


1In this case, the curve need not have a singularity at all; for example, $f(x, y) =
(x - y)^2$ at the origin.
Finally, we shall briefly mention the connection between these
matters and the theory of maxima and minima. Because the first
derivatives vanish, the equation of the tangent plane to the surface
$z = f(x, y)$ at a stationary point $(x_0, y_0)$ is simply

$$z - f(x_0, y_0) = 0.$$

The equation

$$f(x, y) - f(x_0, y_0) = 0$$

therefore gives us the projection on the x,y-plane of the curve of
intersection of the tangent plane with the surface, and we see that the
point $(x_0, y_0)$ is a singular point of this curve. If this is an isolated
point, in a certain neighborhood the tangent plane has no other point
in common with the surface, and the function $f(x, y)$ has a maximum
or a minimum at the point $(x_0, y_0)$ (cf. p. 349). If, however, the singular
point is a multiple point, the tangent plane cuts the surface in a curve
with two branches, and $(x_0, y_0)$ is a saddle point. These remarks lead
us precisely to the sufficient conditions that we found earlier in
Section A.1.


Exercises A.3 


1. Find the singular points of the following curves and discuss their
nature:

(a) $(x^2 + y^2)^2 - 2c^2(x^2 - y^2) = 0$, $c \neq 0$

(b) $x^2 + y^2 - 2x^3 - 2y^3 + 2x^2 y^2 = 0$

(c) $x^4 + y^4 - 2(x - y)^2 = 0$

(d) $x^5 - x^4 + 2x^2 y - y^2 = 0$.


A.4 Singular Points of Surfaces 


In a similar way we can discuss a singular point of a surface
$f(x, y, z) = 0$, that is, a point for which

$$f = 0, \qquad f_x = f_y = f_z = 0.$$

Without loss of generality we may take the point as the origin $O$. If
we write

$$f_{xx} = \alpha, \quad f_{yy} = \beta, \quad f_{zz} = \gamma, \quad f_{xy} = \lambda, \quad f_{yz} = \mu, \quad f_{xz} = \nu$$

for the values at this point, we obtain the equation

$$\alpha x^2 + \beta y^2 + \gamma z^2 + 2\lambda xy + 2\mu yz + 2\nu xz = 0$$

for a point $(x, y, z)$ that lies on a tangent to the surface at $O$.

This equation represents a quadratic cone touching the surface at
the singular point (instead of the tangent plane at an ordinary point
of the surface) if we assume that not all of the quantities $\alpha, \beta, \ldots, \nu$
vanish and that the above equation has real solutions other than
$x = y = z = 0$.


Exercises A.4 


1. Using the results of Exercise 6 of A.1 examine the behavior of a surface 
in a neighborhood of a singular point. 


A.5 Connection Between Euler’s and Lagrange’s 
Representations of the Motion of a Fluid 


Let $(a, b, c)$ be the coordinates of a particle at the time $t = 0$ in a
moving continuum (liquid or gas). The motion can then be represented
by the three functions

$$x = x(a, b, c, t), \qquad y = y(a, b, c, t), \qquad z = z(a, b, c, t),$$

or in terms of a position vector $\mathbf{X} = \mathbf{X}(a, b, c, t)$. Velocity and acceleration
are given by the derivatives with respect to the time $t$. Thus,
the velocity vector is $\dot{\mathbf{X}}$ with components $\dot{x}, \dot{y}, \dot{z}$, and the acceleration
vector is $\ddot{\mathbf{X}}$ with components $\ddot{x}, \ddot{y}, \ddot{z}$, all of which appear as functions
of the initial position $(a, b, c)$ and the parameter $t$. For each value of $t$
we have a transformation of the coordinates $(a, b, c)$ belonging to the
different points of the moving continuum into the coordinates $(x, y, z)$
at the time $t$. This is the so-called Lagrange representation of the
motion. Another representation introduced by Euler is based upon the
knowledge of three functions

$$u(x, y, z, t), \qquad v(x, y, z, t), \qquad w(x, y, z, t)$$

representing the components $\dot{x}, \dot{y}, \dot{z}$ of the velocity $\dot{\mathbf{X}}$ of the motion
at the point $(x, y, z)$ at the time $t$.

In order to pass from the first representation to the second we have
to use the first representation to calculate $a, b, c$ as functions of $x, y, z$,
and $t$ and to substitute these expressions in the expressions for
$\dot{x}(a, b, c, t)$, $\dot{y}(a, b, c, t)$, $\dot{z}(a, b, c, t)$:

$$u(x, y, z, t) = \dot{x}(a(x, y, z, t), b(x, y, z, t), c(x, y, z, t), t), \ldots$$

We then get the components of the acceleration from

$$\dot{x}(a, b, c, t) = u(x(a, b, c, t), y(a, b, c, t), z(a, b, c, t), t), \ldots$$

by differentiation with respect to $t$ for fixed $a, b, c$:

$$\ddot{x} = u_x \dot{x} + u_y \dot{y} + u_z \dot{z} + u_t, \ldots$$

or

$$\ddot{x} = u_x u + u_y v + u_z w + u_t,$$
$$\ddot{y} = v_x u + v_y v + v_z w + v_t,$$
$$\ddot{z} = w_x u + w_y v + w_z w + w_t.$$


In the mechanics of a continuum, the following equation connecting
Euler's and Lagrange's representations is fundamental:

$$\operatorname{div} \dot{\mathbf{X}} = u_x + v_y + w_z = \frac{\dot{D}}{D},$$

where

$$D(x, y, z, t) = \frac{d(x, y, z)}{d(a, b, c)}$$

is the Jacobian characterizing the transformation.

The reader may complete the proof of this and the corresponding 
theorem in two dimensions by using the various rules for the differ- 
entiation of implicit functions (see p. 252). 
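
As a quick plausibility check of the relation between the divergence and the Jacobian, one may work through a simple two-dimensional case. The flow $x = a e^{t}$, $y = b e^{2t}$ used below is our own illustrative choice and is not taken from the text.

```latex
\begin{align*}
u &= \dot{x} = a e^{t} = x, \qquad v = \dot{y} = 2 b e^{2t} = 2y, \\
u_x + v_y &= 1 + 2 = 3, \\
D &= \frac{d(x, y)}{d(a, b)}
   = \begin{vmatrix} e^{t} & 0 \\ 0 & e^{2t} \end{vmatrix} = e^{3t}, \\
\frac{\dot{D}}{D} &= \frac{3 e^{3t}}{e^{3t}} = 3 .
\end{align*}
```

Both sides equal 3, as the identity requires. The acceleration formula can be checked in the same way: here $u_t = 0$, so $\ddot{x} = u_x u = x = a e^{t}$, which agrees with differentiating $x = a e^{t}$ twice.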


Exercises A.5 


1. What is the physical interpretation of the relations $u_t = v_t = w_t = 0$?


2. Interpret the relations

$$\ddot{x} = u_x u + u_y v + u_z w + u_t,$$
$$\ddot{y} = v_x u + v_y v + v_z w + v_t,$$
$$\ddot{z} = w_x u + w_y v + w_z w + w_t$$

physically; rewrite these relations using vector notation.
A.6 Tangential Representation of a Closed Curve and the
Isoperimetric Inequality


A family of straight lines with parameter $\alpha$ may be given by

$$x \cos\alpha + y \sin\alpha - p(\alpha) = 0, \tag{83}$$

where $p(\alpha)$ denotes a function that is twice continuously differentiable
and periodic of period $2\pi$ (here $p$ represents the distance of the
line of the family with normal direction $\alpha$ from the origin). The envelope
$C$ of these lines is a closed curve satisfying (83) and the further
equation

$$-x \sin\alpha + y \cos\alpha - p'(\alpha) = 0.$$

Hence,

$$x = p \cos\alpha - p' \sin\alpha, \qquad y = p \sin\alpha + p' \cos\alpha \tag{84}$$

is the parametric representation of $C$ ($\alpha$ being the parameter). Formula
(83) gives the equation of the tangents of $C$ and is referred to as the
tangential equation¹ of $C$, and $p(\alpha)$ as the support function of $C$.
Since

$$x' = -(p + p'') \sin\alpha, \qquad y' = (p + p'') \cos\alpha,$$

we at once have the following expressions for the length $L$ and area
$A$ of $C$:

$$L = \int_0^{2\pi} \sqrt{x'^2 + y'^2}\, d\alpha = \int_0^{2\pi} (p + p'')\, d\alpha = \int_0^{2\pi} p\, d\alpha,$$

$$A = \frac{1}{2} \int_0^{2\pi} (x y' - y x')\, d\alpha = \frac{1}{2} \int_0^{2\pi} (p + p'')\, p\, d\alpha = \frac{1}{2} \int_0^{2\pi} (p^2 - p'^2)\, d\alpha,$$

since $p'(\alpha)$ is also a function of period $2\pi$.²


1The representation of C in the form (84) is valid for any closed convex curve whose 
curvature is finite and positive, and varies continuously along C. 

2Since p(a) + c is obviously the support function of the parallel curve at a distance 
c from C, the formulae for the area and the length of a parallel curve (cf. Volume I, 
p. 437, Exercise 7, and its solution in A. Blank: Problems in Calculus and Analysis, 
p. 188) are easily derived from these expressions. 
From this we deduce the isoperimetric inequality

$$L^2 \geq 4\pi A,$$

where the equality sign holds for the circle only. This may also be
expressed by the statement: Among all closed curves of given length
the circle has the greatest area.

For the proof we make use of the Fourier expansion of $p(\alpha)$ (Volume
I, p. 594),

$$p(\alpha) = \frac{a_0}{2} + \sum_{\nu=1}^{\infty} (a_\nu \cos\nu\alpha + b_\nu \sin\nu\alpha);$$

then

$$p'(\alpha) = \sum_{\nu=1}^{\infty} \nu\, (b_\nu \cos\nu\alpha - a_\nu \sin\nu\alpha),$$

so that (using the orthogonality relations of Volume I, p. 593) we have

$$L = \pi a_0, \qquad A = \frac{\pi a_0^2}{4} - \frac{\pi}{2} \sum_{\nu=2}^{\infty} (\nu^2 - 1)(a_\nu^2 + b_\nu^2).$$

Thus,

$$A \leq \frac{\pi a_0^2}{4} = \frac{L^2}{4\pi};$$

in particular, $A = L^2/4\pi$ only if $a_\nu = b_\nu = 0$ for $\nu \geq 2$; that is, $p(\alpha) =
a_0/2 + a_1 \cos\alpha + b_1 \sin\alpha$. The latter equation defines a circle, as is
easily proved from (84).
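
The formulas for $L$ and $A$ in terms of the support function lend themselves to a numerical check. The sketch below is only illustrative: the support function $p(\alpha) = 3 + \tfrac12 \cos 2\alpha$ is our own choice (it satisfies $p + p'' > 0$, so the envelope is convex), and the helper name `length_and_area` is hypothetical. For this $p$ we have $a_0 = 6$ and $a_2 = \tfrac12$, so the Fourier formulas predict $L = 6\pi$ and $A = 9\pi - \tfrac{3}{8}\pi$.

```python
import math


def length_and_area(p, dp, n=2000):
    """Approximate L = int p d(alpha) and A = 1/2 int (p^2 - p'^2) d(alpha)
    over [0, 2*pi] by an equally spaced Riemann sum (hypothetical helper).
    p is the support function and dp its derivative."""
    h = 2 * math.pi / n
    length = 0.0
    area = 0.0
    for k in range(n):
        alpha = k * h
        length += p(alpha) * h
        area += 0.5 * (p(alpha) ** 2 - dp(alpha) ** 2) * h
    return length, area


def p(alpha):
    # Illustrative support function, not taken from the text.
    return 3 + 0.5 * math.cos(2 * alpha)


def dp(alpha):
    # Derivative of p.
    return -math.sin(2 * alpha)


L, A = length_and_area(p, dp)
print(L, 6 * math.pi)           # numerical and predicted length
print(A, 8.625 * math.pi)       # numerical and predicted area
print(L * L - 4 * math.pi * A)  # isoperimetric deficit, positive here
```

Since the curve is not a circle, the deficit $L^2 - 4\pi A$ is strictly positive, in agreement with the inequality.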


Exercises A.6 


1. Find the equations of the envelopes, their lengths, and contained areas,
for each of the following families of straight lines:


(a) $(x + 2) \cos\alpha + y \sin\alpha + 2 = 0$
(b) $x \cos\alpha + y \sin\alpha + 4 \sin 2\alpha = 0$.


2. Compare the formulae for area and length. Can there exist curves of 
arbitrarily large length enclosing arbitrarily small area? 


3. Can every closed curve be represented as the envelope of lines (83)? 


CHAPTER 4


Multiple Integrals


Differentiation and operations with derivatives for functions of
several variables are directly reducible to their analogues for functions
of one variable. Integration and its relation to differentiation
are more involved, since the concept of integral can be generalized 
for functions of several variables in a variety of ways. Thus, for a 
function f(x, y, z) of three independent variables, we have to consider 
integrals over surfaces and lines, as well as integrals over regions of 
space. Nonetheless, all questions of integration will be related to the 
original concept of the integral of a function of a single independent 
variable. 

For simplicity we shall work mainly in the plane (i.e., with two
independent variables). However, all arguments apply equally well to
higher dimensions with mere changes of terminology ("area" replaced by
"volume," "square" by "cube," etc.).


4.1 Area in the Plane 


a. Definition of the Jordan Measure of Area


In Volume I we expressed the area of a region in the x, y-plane by 
integrals of functions of a single variable. The basic idea (which led 
us to the notion of integral in the first place) was to approximate the 
region by simpler regions consisting of a finite number of rectangles. 
For a more systematic development of areas that immediately carries 
over to volumes in three or more dimensions, it is desirable to give a 
direct definition that is not tied to the idea of integration of functions 
of one variable and corresponds more closely to the intuitive notion 
of the area of a region as the "number of square units" contained in
the region. At the same time, this new and more natural definition is 
more general and avoids all extraneous discussion of the regularity 
of the boundary, which becomes inevitable whenever we try to reduce 
areas to single integrals. As usual, we postpone rigorous existence 
proofs to the Appendix of this chapter. Those proofs only present 
systematically what should already be more or less obvious to the 
reader from the informal discussions of ideas and purposes presented 
in the main text. 

In defining areas, we accept the intuitive idea that the area A(S) 
of a set S should be a nonnegative number attached to S that has the 
following properties: 


1. If $S$ is a square of side $k$, then $A(S) = k^2$.

2. Additivity: The area of the whole is the sum of the areas of its
parts. More precisely, if $S$ consists of nonoverlapping¹ sets $S_1, \ldots, S_N$
of areas $A(S_1), \ldots, A(S_N)$, respectively, then the area of $S$ is

$$A(S) = A(S_1) + \cdots + A(S_N).$$

On the basis of these simple requirements, we shall be able to assign
a value $A(S)$ to most of the two-dimensional sets $S$ encountered in
practice, although not to all imaginable sets $S$ in the plane.

To arrive at a uniquely determined value $A(S)$ for a bounded set $S$,
we use very special divisions of the plane into squares; it will be
shown subsequently that every other way of dividing the plane into
squares (or rectangles) will lead to the same area. Congruent squares
provide the easiest way of covering the plane without gaps or overlap.
We use the grid attached to our coordinate system provided by the
lines $x = 0, \pm 1, \pm 2, \pm 3, \ldots$ and $y = 0, \pm 1, \pm 2, \ldots$, which divide the
whole plane into closed squares of side 1. We denote by $A_0^+(S)$ the
number of squares having points in common with $S$ and by $A_0^-(S)$ the
number of those completely contained in $S$. We next divide each
square into four equal squares of side $\tfrac12$ and area $\tfrac14$ and denote by
$A_1^+(S)$ one-fourth of the number of those subsquares having points in
common with $S$ and by $A_1^-(S)$ one-fourth of the number of those completely
contained in $S$. Since each unit square completely contained in $S$ gives rise to
four subsquares completely contained in $S$, we have $A_0^-(S) \leq A_1^-(S)$, and
similarly $A_0^+(S) \geq A_1^+(S)$. We next divide each square of side $\tfrac12$ further
into 4 squares of side $\tfrac14$. One-sixteenth of those squares having points


1The sets are nonoverlapping if every interior point of one of the sets is exterior to all 
the other sets. We call the sets disjoint if every point of one of the sets belongs to no 
others. 
in common with $S$ and one-sixteenth of those contained in $S$ will be
denoted, respectively, by $A_2^+(S)$ and $A_2^-(S)$. Proceeding in this fashion,
we associate values $A_n^+(S)$ and $A_n^-(S)$ with a division of the plane into
squares of side $2^{-n}$ (see Fig. 4.1). It is clear that the values $A_n^+(S)$ form a
monotone decreasing and bounded sequence that converges toward
a value $A^+(S)$, while the $A_n^-(S)$ increase monotonically and converge
toward a value $A^-(S)$. The value $A^-(S)$ represents the inner area, the
closest we can approximate the area of $S$ from below by congruent
squares contained in $S$; the outer area $A^+(S)$ represents the best upper
bound obtainable by covering $S$ by congruent squares. If both values
agree, we say that $S$ is Jordan-measurable and call the common value
$A^-(S) = A^+(S)$ the content, or the Jordan measure, of $S$. We shall use
the simpler term area $A(S)$ for the content of $S$, and shall say "$S$ has
an area" instead of using the clumsier phrase "$S$ is Jordan-measurable"
to denote the fact that $A^-(S) = A^+(S)$ (which is true for almost
all sets occurring in practice).


Figure 4.1 Interior and exterior approximations to the
area of the unit disk $x^2 + y^2 \leq 1$, for $n = 0, 1, 2$, where
$A_0^- = 0$, $A_1^- = 1$, $A_2^- = 2$, $A_2^+ = 4\tfrac14$, $A_1^+ = 6$, $A_0^+ = 12$.
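
The counting procedure just described is easy to carry out mechanically for the closed unit disk of Fig. 4.1. The following sketch is ours (the name `jordan_bounds_unit_disk` is hypothetical); it works in grid units of $2^{-n}$ with exact integer arithmetic, so that squares merely touching the disk at a single boundary point are counted correctly as having a point in common with it.

```python
from fractions import Fraction


def jordan_bounds_unit_disk(n):
    """Return (inner, outer) approximations of the area of the closed unit
    disk using the grid of closed squares of side 2**(-n)."""
    m = 2 ** n  # the disk has radius m in grid units
    inner = 0   # squares completely contained in the disk
    outer = 0   # squares having at least one point in common with the disk
    for i in range(-m - 1, m + 1):
        for j in range(-m - 1, m + 1):
            # The square is [i, i+1] x [j, j+1] in grid units.
            # Squared distance from the origin to its nearest point:
            near2 = min(abs(i), abs(i + 1)) ** 2 + min(abs(j), abs(j + 1)) ** 2
            # Squared distance from the origin to its farthest corner:
            far2 = max(abs(i), abs(i + 1)) ** 2 + max(abs(j), abs(j + 1)) ** 2
            if near2 <= m * m:
                outer += 1
            if far2 <= m * m:
                inner += 1
    # Each square has area 4**(-n) = 1 / m**2.
    return Fraction(inner, m * m), Fraction(outer, m * m)


for n in range(6):
    lo, hi = jordan_bounds_unit_disk(n)
    print(n, float(lo), float(hi))
```

For $n = 0, 1, 2$ the routine returns the pairs $(0, 12)$, $(1, 6)$, and $(2, 4\tfrac14)$ quoted in the caption of Fig. 4.1; for larger $n$ both bounds slowly approach $\pi$.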

The difference $A_n^+(S) - A_n^-(S)$ represents the total area of the
squares in the nth subdivision that have points in common with $S$
without lying completely in $S$. All these squares contain boundary
points of $S$, so that

$$A_n^+(S) - A_n^-(S) \leq A_n^+(\partial S),$$

where $\partial S$ is the boundary of $S$. If the boundary of $S$ has the area 0, we
find that

$$A^+(S) - A^-(S) = \lim_{n \to \infty} [A_n^+(S) - A_n^-(S)] \leq \lim_{n \to \infty} A_n^+(\partial S) = 0,$$

that is, that $S$ has an area. Thus, $S$ has an area if its boundary $\partial S$ has
area 0. (This condition is also necessary; see p. 518.)

In order to verify that a given set $S$ has an area or that $\partial S$ has area
0 we would have to show that the total area of the squares in the nth
subdivision that have points in common with $\partial S$ is arbitrarily small
for $n$ sufficiently large. Actually, it is not necessary to use squares of
side $2^{-n}$ for this analysis. A set $S$ certainly has an area if for every $\varepsilon > 0$
we can find a finite number of sets $S_1, \ldots, S_N$ that cover the boundary
$\partial S$ of $S$ and have total area $< \varepsilon$. Then, for any $n$, obviously

$$A_n^+(\partial S) \leq A_n^+(S_1) + \cdots + A_n^+(S_N),$$

since any square that has points in common with $\partial S$ has points in
common with at least one of the sets $S_1, \ldots, S_N$. Here, for $n \to \infty$,
the right-hand side tends to the sum of the areas of the $S_i$, which is less
than $\varepsilon$; thus $A^+(\partial S) < \varepsilon$; since $\varepsilon$ is an arbitrary positive number,
we conclude that $A^+(\partial S) = 0$.

This criterion is sufficient to establish that most of the common
regions $S$ encountered in analysis have area. In particular, it is sufficient
to know that the boundary of $S$ consists of a finite number of arcs
each of which has a continuous nonparametric representation $y = f(x)$
or $x = g(y)$ with $f$ or $g$, respectively, continuous in a finite closed interval.
The uniform continuity of continuous functions in bounded
closed intervals immediately permits us to show that these arcs can be
covered by a finite number of rectangles of arbitrarily small total
area.¹


b. A Set That Does Not Have an Area 


An example of a set that does not have an area in our sense (or is
not "Jordan-measurable") is the set $S$ of "rational" points in the
unit square, that is, the set of points whose coordinates $x, y$ are both


1We leave as an exercise for the reader to prove that a rectangle with sides parallel 
to the axes has an area (as defined here) equal to the product of two adjacent sides. 




rational numbers between 0 and 1. It is evident from the density
property of rational and irrational numbers that

$$A_n^+ = 1, \qquad A_n^- = 0$$

for all $n$, so that $S$ has outer area 1 and inner area 0. This agrees with
the fact that the boundary $\partial S$ of $S$ consists of the whole closed unit
square and has area 1. If we cover $S$ in any way by a finite number of
closed sets $S_1, \ldots, S_N$ with areas $A(S_1), \ldots, A(S_N)$, respectively,
then

$$A(S_1) + \cdots + A(S_N) \geq 1,$$

since the $S_i$ necessarily also cover the boundary $\partial S$ of $S$ (see Exercise
6). Paradoxically, however, it is possible to cover $S$ by an infinite
number of closed sets $S_i$ of arbitrarily small total area. We only have
to use the fact that the pairs $(x, y)$ of rational numbers form a denumerable
set (see Volume I, p. 98).¹ Thus, the points of $S$ can be
arranged into an infinite sequence $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots$. Let $\varepsilon$
be an arbitrary positive number. Denote for each integer $m > 0$ by
$S_m$ a square of area $\varepsilon 2^{-m}$ and center $(x_m, y_m)$. Then the $S_m$ cover the
whole set $S$, while their total area is given by

$$\frac{\varepsilon}{2} + \frac{\varepsilon}{4} + \frac{\varepsilon}{8} + \frac{\varepsilon}{16} + \cdots = \varepsilon.$$

Thus, coverings by infinitely many unequal squares can lead to a 
substantial lowering of the upper bound $A^+(S)$ for the "area" of $S$,
reflecting more closely the “rarity” of the rational points among the 
real ones. One of the starting points in the refined theory of measuring 
sets, originated by Lebesgue, is to define the outer area of a set as the 
greatest lower bound of the sum of areas of any finite or infinite set of 
squares covering it. For our set S this outer Lebesgue area has the 
value 0, the same as the inner area of S. Incidentally, for a closed and 
bounded set S the two definitions of outer area agree, since by the 


1We can arrange them, for example, in groups, according to the size of the larger of
the two denominators; each group has only a finite number of elements.
Heine-Borel theorem (cf. p. 109) any infinite covering of S already
contains a finite covering.


c. Rules for Operations with Areas 


In most cases that interest us we can establish the existence of an 
area of a set S by verifying that S is bounded by a finite number of arcs
with continuous nonparametric representation. For that reason one 
might be tempted to exclude all other regions with more complicated 
boundaries from consideration. It turns out however that such a re- 
striction not only results in a loss of generality but actually compli- 
cates matters, since we have to make sure that the regions resulting 
from the operations of set union and intersection again have simple 
boundaries. The advantage of our general definition of area as content 
is that it is based on the primitive notion of counting of squares; 
nothing is postulated about the boundary at all beyond the require- 
ment that it can be covered by a finite number of squares of arbitrarily 
small total area. The boundary of a Jordan-measurable set can be 
very complicated in detail, consisting perhaps of infinitely many 
closed curves. These complications will have no effect in the theory of 
integration, as long as we can show that the total contribution arising 
from the boundary is negligible. 

For work with areas, the operations of dividing a set into subsets 
and of combining sets into larger ones are basic. The important point 
is that applying these operations we stay within the class of sets that 
have areas. We have the fundamental theorem that the union $S \cup T$
and the intersection $S \cap T$ of two Jordan-measurable sets $S$ and $T$ are
again Jordan-measurable.¹ This follows immediately from the fact that
the boundaries of $S \cup T$ and of $S \cap T$ consist of boundary points of
$S$ or $T$ and, hence, have again area 0 (see p. 521).

For the important case of two nonoverlapping sets S, T—that is, 
sets such that no interior point of one belongs to the other set or to 
its boundary—the law of additivity for areas holds: 


$$A(S \cup T) = A(S) + A(T).$$


More generally, for any finite number of Jordan-measurable sets $S_1,
S_2, \ldots, S_N$, no two of which overlap, we have the relation

$$A\Bigl(\bigcup_{i=1}^{N} S_i\Bigr) = \sum_{i=1}^{N} A(S_i). \tag{1}$$


1We remind the reader that the union of sets consists of the points belonging to at 
least one of the sets and the intersection of those points belonging to all. 
The proof is trivial on the basis of the inequalities

$$A_n^+\Bigl(\bigcup_{i=1}^{N} S_i\Bigr) \leq \sum_{i=1}^{N} A_n^+(S_i),$$

$$A_n^-\Bigl(\bigcup_{i=1}^{N} S_i\Bigr) \geq \sum_{i=1}^{N} A_n^-(S_i).$$

Here the first inequality follows simply from the fact that any square
that has points in common with the union of the $S_i$ must have points
in common with at least one of the $S_i$. The second one follows from
the fact that any square contained in one set $S_i$ cannot be contained
in any other $S_j$ (since the two are nonoverlapping) but is contained in
their union. For $n \to \infty$, we conclude that

$$A^+\Bigl(\bigcup_{i=1}^{N} S_i\Bigr) \leq \sum_{i=1}^{N} A^+(S_i),$$

$$A^-\Bigl(\bigcup_{i=1}^{N} S_i\Bigr) \geq \sum_{i=1}^{N} A^-(S_i).$$

From the assumption that the $S_i$ have areas, that is, that

$$A^+(S_i) = A^-(S_i) = A(S_i),$$

and that the inner area of the union cannot exceed the outer area,
the equation (1) follows.

It is now easy to verify that "areas" as defined here can be expressed
in terms of integrals in the specific instances considered in
Volume I. For example, let the set $S$ consist of the points "below" the
graph of a continuous positive function $y = f(x)$ in an interval $a \leq x \leq b$,
that is, the set of points $(x, y)$ for which

$$a \leq x \leq b, \qquad 0 \leq y \leq f(x).$$

Consider any subdivision of the interval $[a, b]$ into $N$ subintervals of
length $\Delta x_i$, and let $m_i$ be the minimum and $M_i$ the maximum of $f(x)$
in the ith subinterval. The rectangles with base $\Delta x_i$ and height $m_i$
are clearly nonoverlapping and their union is contained in $S$, so that

$$\sum_{i=1}^{N} m_i\, \Delta x_i \leq A(S).$$

Similarly,

$$A(S) \leq \sum_{i=1}^{N} M_i\, \Delta x_i.$$

For continuous $f$, the lower and upper sums both tend to the integral
of $f$ and we arrive at the classical expression

$$A(S) = \int_a^b f(x)\, dx \tag{2}$$

for the area of $S$.
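
The way in which the lower and upper sums enclose the area can be seen in a small computation. The sketch below is ours and makes a simplifying assumption: the function is increasing on $[a, b]$, so that $m_i$ and $M_i$ are the values at the left and right endpoints of each subinterval. The example $f(x) = x^2 + 1$ on $[0, 1]$, with area $\tfrac43$, and the name `lower_upper_sums` are illustrative choices.

```python
def lower_upper_sums(f, a, b, n):
    """Lower and upper sums of an INCREASING function f on [a, b] for a
    subdivision into n equal subintervals (hypothetical helper).
    For increasing f the minimum m_i is attained at the left endpoint and
    the maximum M_i at the right endpoint of each subinterval."""
    dx = (b - a) / n
    lower = sum(f(a + i * dx) for i in range(n)) * dx
    upper = sum(f(a + (i + 1) * dx) for i in range(n)) * dx
    return lower, upper


def f(x):
    # Illustrative integrand; the area below its graph on [0, 1] is 4/3.
    return x * x + 1


lower, upper = lower_upper_sums(f, 0.0, 1.0, 1000)
print(lower, upper)    # the two sums bracket 4/3
print(upper - lower)   # equals (f(1) - f(0)) / n up to rounding
```

For an increasing function the difference of the two sums telescopes to $(f(b) - f(a))(b - a)/N$, which makes the convergence of both sums to the same limit visible at a glance.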


Exercises 4.1 


1. Show that if $S$ and $T$ have area and if $S$ is contained in $T$, then $A(S) \leq
A(T)$.


2. Under the hypothesis of Exercise 1, show that $T - S$ has area, where
$T - S$ is the set of points of $T$ that are not contained in $S$.


3. Show that if $S$ and $T$ are bounded,
(a) $A^+(S \cup T) + A^+(S \cap T) \leq A^+(S) + A^+(T)$
(b) $A^-(S \cup T) + A^-(S \cap T) \geq A^-(S) + A^-(T)$

4. Let $S$ and $T$ be any disjoint sets whose union has area. Show that
$A^+(S) + A^-(T) = A(S \cup T)$.


5. (a) Show that if a set $S$ has area in one coordinate system, it has area in
any other coordinate system obtained by rotation and translation of
axes.
(b) Show that the area of $S$ is the same in both coordinate systems.

6. Let $S$ be covered by a finite collection $S_1, \ldots, S_N$ of closed sets. Show
that the collection also covers the boundary $\partial S$ of $S$.


7. Does the set $S$ of points $(1/p, 1/q)$, where $p$ and $q$ are natural numbers,
have an area?


4.2 Double Integrals 


a. The Double Integral as a Volume 


Everything said about areas in the preceding paragraphs carries
over immediately to volumes in three or higher dimensions. In defining
the volume $V(S)$ of a bounded set $S$ in $x, y, z$-space, we need
only use subdivisions of space into cubes of side $2^{-n}$. The set $S$ will
have a volume when its boundary can be covered by a finite number
of these cubes of arbitrarily small total volume. This is the case for
all bounded sets $S$ whose boundary consists of a finite number of
surfaces each of which has a continuous nonparametric representation
$z = f(x, y)$ or $y = g(x, z)$ or $x = h(y, z)$ on a closed planar set.

The attempt to represent the volume analytically leads directly to 
the notion of multiple integral, which has a great variety of ap- 
plications. 
Let $R$, a Jordan-measurable closed and bounded set in the $x, y$-plane,
be the domain of a positive-valued function $z = f(x, y)$. We wish
to find the volume "below" the surface $z = f(x, y)$, that is, the volume
$V(S)$ of the set $S$ of points $(x, y, z)$ for which

$$(x, y) \in R, \qquad 0 \leq z \leq f(x, y).$$

For this purpose, we divide $R$ into nonoverlapping closed Jordan-measurable
sets $R_1, \ldots, R_N$. Let $m_i$ be the minimum, and $M_i$ the
maximum, of $f$ for $(x, y)$ in $R_i$. It is easily seen that the cylinder with
base $R_i$ and height $m_i$ has the volume $m_i A(R_i)$, where $A(R_i)$ is the
area of $R_i$ (Fig. 4.2).¹ These cylinders do not overlap. Similarly, the
cylinders with base $R_i$ and height $M_i$ have volume $M_i A(R_i)$ and do not
overlap. It follows that

$$\sum_{i=1}^{N} m_i A(R_i) \leq V(S) \leq \sum_{i=1}^{N} M_i A(R_i). \tag{3a}$$


1When we divide space into cubes of side $2^{-n}$, the cubes having points in common with
the cylinder can be arranged into cylindrical "columns" whose cross section is a
square having a point in common with $R_i$ and whose height differs by less than $2^{-n}$
from $m_i$.
The sums appearing in this inequality we call, respectively, the lower
and upper sums.

We now make our subdivision finer and finer, in the sense that the
largest diameter of any $R_i$ occurring in the subdivision tends to zero.¹
The continuous function $f(x, y)$ is uniformly continuous in the compact
set $R$, so that the maximum difference $M_i - m_i$ tends to zero with
the maximum diameter of the sets $R_i$ in the subdivision. The difference
between the upper and lower sums also tends to zero, since

$$\sum_{i=1}^{N} M_i A(R_i) - \sum_{i=1}^{N} m_i A(R_i) = \sum_{i=1}^{N} (M_i - m_i) A(R_i) \leq \bigl[\max_i (M_i - m_i)\bigr] \sum_{i=1}^{N} A(R_i) = \bigl[\max_i (M_i - m_i)\bigr] A(R).$$

It follows from (3a) that the upper and lower sums both converge to
the limit $V(S)$ as we refine our subdivision indefinitely. We can obviously
obtain the same limiting value if instead of $m_i$ or $M_i$ we take any
number between $m_i$ and $M_i$, such as $f(x_i, y_i)$, the value of the function
at a point $(x_i, y_i)$ of the set $R_i$. We shall call the limit $V(S)$ the double
integral of $f$ over the set $R$ and write

$$V(S) = \iint_R f(x, y)\, dR. \tag{3b}$$


b. The General Analytic Concept of the Integral 


The concept of double integral as volume suggested by geometry
must now be studied analytically and be made more precise without
reference to intuition. We consider a closed and bounded Jordan-measurable
set $R$ with area $A(R) = \Delta R$, and a function $f(x, y)$ that is
continuous everywhere in $R$ (including the boundary). As before, we
subdivide $R$ into $N$ nonoverlapping Jordan-measurable subsets $R_1,
R_2, \ldots, R_N$ with areas $\Delta R_1, \ldots, \Delta R_N$. In $R_i$ we choose an arbitrary
point $(\xi_i, \eta_i)$, where the function has the value $f_i = f(\xi_i, \eta_i)$, and we
form the sum

$$V_N = \sum_{i=1}^{N} f_i\, \Delta R_i = \sum_{i=1}^{N} f_i\, A(R_i).$$

The fundamental existence theorem then states:


1The “diameter” of a closed set is the maximum distance of any two points in the set. 
If the number $N$ increases beyond all bounds and at the same time
the greatest of the diameters of the subregions tends to zero, then $V_N$
tends to a limit $V$. This limit is independent of the particular nature of
the subdivision of the region $R$ and of the choice of the point $(\xi_i, \eta_i)$
in $R_i$. We call the limit $V$ the (double) integral of the function $f(x, y)$
over the region $R$ and denote it by

$$\iint_R f(x, y)\, dR.$$ ¹

COROLLARY. We obtain the same limit if we take the sum only over
those subregions $R_i$ that lie entirely in the interior of $R$, that is, which
have no points in common with the boundary of $R$.²

This existence theorem for the integral of a continuous function 
must be proved in a purely analytical way. The proof, which is very 
similar to the corresponding proof for one variable, is given in the 
appendix to this chapter (p. 526). 

We now illustrate this concept of an integral by considering some
special subdivisions. The simplest case is that in which $R$ is a rectangle
$a \leq x \leq b$, $c \leq y \leq d$ and the subregions $R_i$ are also rectangles
(formed by subdividing the x-interval into $n$ equal parts and
the y-interval into $m$ equal parts) of lengths

$$h = \frac{b - a}{n} \qquad \text{and} \qquad k = \frac{d - c}{m}.$$


1We can refine this theorem further in a way useful for many purposes. In the subdivision
into $N$ subregions it is not necessary to choose a value that is actually assumed
by the function $f(x, y)$ at a definite point $(\xi_i, \eta_i)$ of the corresponding subregion;
it is sufficient to choose values that differ from the values of the function
$f(\xi_i, \eta_i)$ by quantities that tend uniformly to 0 as the subdivision is made finer. In
other words, instead of the values of the function $f(\xi_i, \eta_i)$ we can consider the
quantities

$$f_i = f(\xi_i, \eta_i) + \varepsilon_{i,N},$$

where $|\varepsilon_{i,N}| \leq \varepsilon_N$, $\lim_{N \to \infty} \varepsilon_N = 0$. This theorem is almost trivial, for, since the numbers
$\varepsilon_{i,N}$ tend uniformly to 0, the absolute value of the difference between the two sums

$$\sum_{i=1}^{N} f_i\, \Delta R_i \qquad \text{and} \qquad \sum_{i=1}^{N} (f_i + \varepsilon_{i,N})\, \Delta R_i$$

is less than $\varepsilon_N \sum \Delta R_i$, and can be made as small as we please if we take the number
$N$ sufficiently large. For example, if $f(x, y) = P(x, y)\, Q(x, y)$, we may take $f_i = P_i Q_i$,
where $P_i$ and $Q_i$ are the maxima of $P$ and $Q$ in $R_i$, which are in general not assumed
at the same point.

2The corollary follows from the fact that not only the boundary $\partial R$ of $R$ but also
the set of all points sufficiently close to $\partial R$ can be covered by squares of arbitrarily
small total area.
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The points of subdivision we call xo = a, xı, X2,..., Xn = b and 
yo = C, Y1, Y2, . > +> Ym = d. They correspond to parallels to the y-axis 
and x-axis, respectively. We then have N = nm. The subregions are 
all rectangles with area A(Ri) = AR; = hk = Ax Ay, where h = Ax, 
k = Ay. For the point (&, ni) we take any point in the corresponding 
rectangle Ri, and then form the sum 


3 f&n Ax Ay 


for all the rectangles of the subdivision. 

If we now let n and m simultaneously increase beyond all bounds, 
the sum tends to the integral of the function f over the rectangle R. 

These rectangles can also be characterized by two suffixes p and v, 
corresponding to the coordinates x = a + vh and y = c + uk of the 
lower left-hand corner of the rectangle in question. Here v assumes 
integral values from 0 to (n — 1) and p from 0 to (m — 1). With this 
identification of the rectangles by the suffixes v and p, we may ap- 
propriately write the sum as a double sum! 


(3c) Sy Ss fey Ax Ay. 
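The double sum (3c) can be tried out numerically. The following is a minimal sketch, not part of the original text: the function name `riemann_sum_rectangle` and the test integrand xy are illustrative assumptions, and the sample point in each cell is taken to be its lower left-hand corner.

```python
def riemann_sum_rectangle(f, a, b, c, d, n, m):
    """Double sum (3c), sampling f at the lower left-hand corner of each cell."""
    h = (b - a) / n  # Delta x
    k = (d - c) / m  # Delta y
    total = 0.0
    for nu in range(n):        # nu = 0, ..., n - 1
        for mu in range(m):    # mu = 0, ..., m - 1
            total += f(a + nu * h, c + mu * k) * h * k
    return total


def integrand(x, y):
    # Illustrative choice; its integral over 0 <= x <= 1, 0 <= y <= 2 is 1.
    return x * y


coarse = riemann_sum_rectangle(integrand, 0.0, 1.0, 0.0, 2.0, 10, 10)
fine = riemann_sum_rectangle(integrand, 0.0, 1.0, 0.0, 2.0, 200, 200)
print(coarse, fine)
```

As n and m grow, the printed values move toward 1, the value of the integral for this particular integrand.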


Even when R is not a rectangle, it is often convenient to subdivide
the region into rectangular subregions $R_i$. To do this we superimpose
on the plane the rectangular net formed by the lines

$$x = \nu h \quad (\nu = 0, \pm 1, \pm 2, \ldots),$$
$$y = \mu k \quad (\mu = 0, \pm 1, \pm 2, \ldots),$$

where h and k are numbers chosen arbitrarily. We now consider all
those rectangles of the division that lie entirely within R. These
rectangles we call $R_i$. Of course, they do not completely fill the region;
on the contrary, in addition to these rectangles R also contains
certain regions adjacent to the boundary that are bounded partly
by lines of the net and partly by portions of the boundary of R. By the
corollary on p. 377 we can calculate the integral of the function f over
the region R by summing over the interior rectangles only and then
passing to the limit.

Another type of subdivision frequently applied is the subdivision
by a polar coordinate net (Fig. 4.3). We subdivide the entire angle $2\pi$
into n parts of magnitude $\Delta\theta = 2\pi/n = h$, and we also choose a
second quantity $k = \Delta r$. We now draw the lines $\theta = \nu h$
($\nu = 0, 1, 2, \ldots, n - 1$) through the origin and also the concentric circles
$r_\mu = \mu k$ ($\mu = 1, 2, \ldots$). Those meshes of this net that lie entirely in the
interior of R, we denote by $R_i$, and their areas, by $\Delta R_i$. We can then regard
the integral of the function f(x, y) over the region R as the limit of the sum

$$\sum_i f(\xi_i, \eta_i)\,\Delta R_i,$$

where $(\xi_i, \eta_i)$ is a point chosen arbitrarily in $R_i$. The sum is taken
over all the subregions $R_i$ in the interior of R, and the passage to the
limit consists in letting h and k tend simultaneously to zero.

Figure 4.3 Subdivision by polar coordinate nets.

¹If we are to write the sum in the form (3c), we must suppose that the points
$(\xi_i, \eta_i)$ are chosen so as to lie in vertical or horizontal straight lines.

By elementary geometry the area $\Delta R_i$ is given by the equation

$$\Delta R_i = \tfrac{1}{2}\left(r_{\mu+1}^2 - r_\mu^2\right) h = \tfrac{1}{2}(2\mu + 1)\,k^2 h,$$

if we assume that $R_i$ lies in the ring bounded by the circles with radii
$\mu k$ and $(\mu + 1)k$.
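As a small plausibility check of this area formula (a sketch that is not in the original text; the helper name `polar_cell_area` and the values n = 12, m = 5 are illustrative assumptions), we can add up the areas of all meshes of a polar net that exactly fills the unit disc and compare the total with $\pi$.

```python
import math


def polar_cell_area(mu, k, h):
    """Area of the mesh between radii mu*k and (mu + 1)*k with opening angle h."""
    return 0.5 * (2 * mu + 1) * k * k * h


n, m = 12, 5            # n angular sectors, m rings (illustrative values)
h = 2 * math.pi / n     # angular step
k = 1.0 / m             # radial step, so that the rings fill the unit disc
disc_area = sum(polar_cell_area(mu, k, h) for mu in range(m) for _ in range(n))
print(disc_area, math.pi)
```

The agreement is exact up to rounding, because the odd numbers 1, 3, ..., 2m - 1 add up to m².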


c. Examples 


The simplest example is the function f(x, y) = 1. Here the limit of 
the sum is obviously independent of the mode of subdivision and is 
always equal to the area of the region R. Consequently, the integral 
of the function f(x, y) = 1 over the region is also equal to this area. 
This might have been expected, for the integral is the volume of the 
cylinder of unit altitude with the region R as base. 

As a further example, we consider the integral of the function
f(x, y) = x over the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1. The intuitive
interpretation of the integral as a volume shows that the value of our
integral must be 1/2. We can verify this by means of the analytical
definition of the integral. We subdivide the square into squares of
side h = 1/n, and for the point $(\xi_i, \eta_i)$ we choose the lower left-hand
corner of each small square. Then each square in the vertical column
whose left-hand side has the abscissa $\nu h$ contributes the amount $\nu h^3$
to the sum. This expression occurs n times. Thus, since nh = 1, the
contribution of the whole column of squares amounts to $n\nu h^3 = \nu h^2$.
We now form the sum from $\nu = 0$ to $\nu = n - 1$, to obtain

$$\sum_{\nu=0}^{n-1} \nu h^2 = \frac{n(n-1)}{2}\,h^2 = \frac{1}{2} - \frac{h}{2}.$$

The limit of this expression as h → 0 is 1/2, as we stated.
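The closed form 1/2 - h/2 can be compared with a direct evaluation of the sum. This is only an illustrative sketch under the assumptions of the example above (unit square, lower left-hand corners as sample points); the function name `lower_left_sum_of_x` is ours.

```python
def lower_left_sum_of_x(n):
    """Riemann sum of f(x, y) = x over the unit square, lower left-hand corners."""
    h = 1.0 / n
    total = 0.0
    for nu in range(n):       # column index: abscissa nu * h
        for mu in range(n):   # row index: f does not depend on y
            total += (nu * h) * h * h
    return total


for n in (4, 10, 100):
    h = 1.0 / n
    print(n, lower_left_sum_of_x(n), 0.5 - h / 2)
```

Both columns of the output agree up to rounding and approach 1/2 as n increases.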

In a similar way we can integrate the product xy or, more generally,
any function f(x, y) that can be represented as a product of a function
of x and a function of y in the form $f(x, y) = \phi(x)\,\psi(y)$, provided that the
region of integration is a rectangle with sides parallel to the axes,
say a ≤ x ≤ b, c ≤ y ≤ d. We use the same division of the rectangle
as in (3c), and for the value of the function in each subrectangle we
take the value of the function at the lower left-hand corner. The
integral is then the limit of the sum

$$hk \sum_{\nu=0}^{n-1} \sum_{\mu=0}^{m-1} \phi(a + \nu h)\,\psi(c + \mu k),$$

which may be written as the product of two sums in the form

$$\sum_{\nu=0}^{n-1} h\,\phi(a + \nu h) \cdot \sum_{\mu=0}^{m-1} k\,\psi(c + \mu k).$$

From the definition of the ordinary integral, as h → 0 and k → 0 these
factors tend to the integrals of the corresponding functions over the
respective intervals from a to b and from c to d. We thus obtain the
general rule that if a function f(x, y) can be represented as a product
of two functions $\phi(x)$ and $\psi(y)$, its double integral over a rectangle
a ≤ x ≤ b, c ≤ y ≤ d can be resolved into the product of two integrals:

$$\iint_R f(x, y)\,dx\,dy = \int_a^b \phi(x)\,dx \cdot \int_c^d \psi(y)\,dy.$$

This rule and the summation rule (cf. (4b), p. 383) yield the integral of
any polynomial over a rectangle with sides parallel to the axes.
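The factorization of the double sum can be observed directly. The sketch below is an illustration only: the helper `product_rule_sums` and the choice of cos x and exp y as factors are assumptions made for the example, not taken from the text.

```python
import math


def product_rule_sums(phi, psi, a, b, c, d, n, m):
    """Return the double sum and the product of the two single sums."""
    h = (b - a) / n
    k = (d - c) / m
    double_sum = 0.0
    for nu in range(n):
        for mu in range(m):
            double_sum += phi(a + nu * h) * psi(c + mu * k) * h * k
    sum_x = sum(phi(a + nu * h) * h for nu in range(n))
    sum_y = sum(psi(c + mu * k) * k for mu in range(m))
    return double_sum, sum_x * sum_y


# phi(x) = cos x on [0, pi/2] and psi(y) = exp(y) on [0, 1]; the product of
# the two single integrals is 1 * (e - 1).
double_sum, product = product_rule_sums(
    math.cos, math.exp, 0.0, math.pi / 2, 0.0, 1.0, 300, 300
)
exact = math.e - 1.0
print(double_sum, product, exact)
```

The first two numbers coincide up to rounding for every n and m; only their common distance from the exact value depends on the fineness of the subdivision.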

As a last example, we consider a case in which it is convenient to
use a subdivision by the polar coordinate net instead of a subdivision
into rectangles. Let the region R be the circle with unit radius and
center at the origin, given by x² + y² ≤ 1, and let

$$f(x, y) = \sqrt{1 - x^2 - y^2}.$$

The integral of f over R is merely the volume of a hemisphere of unit
radius.

We construct the polar coordinate net as before. The subregion
lying between the circles with radii $r_\mu = \mu k$ and $r_{\mu+1} = (\mu + 1)k$ and
between the lines $\theta = \nu h$ and $\theta = (\nu + 1)h$, where $h = 2\pi/n$, yields the
contribution

$$\sqrt{1 - \rho_\mu^2}\;\tfrac{1}{2}\left(r_{\mu+1}^2 - r_\mu^2\right) h = \sqrt{1 - \rho_\mu^2}\;\rho_\mu k h,$$

where for the value of the function in the subregion $R_i$ we have taken
the value that the function assumes on an intermediate circle with
the radius $\rho_\mu = (r_{\mu+1} + r_\mu)/2$. All subregions that lie in the same ring
give the same contribution, and since there are $n = 2\pi/h$ such regions
the contribution of the whole ring is

$$2\pi \sqrt{1 - \rho_\mu^2}\;\rho_\mu k.$$

The integral is therefore the limit of the sum

$$\sum_{\mu=0}^{m-1} 2\pi \sqrt{1 - \rho_\mu^2}\;\rho_\mu k.$$

As we already know, this sum tends to the single integral

$$2\pi \int_0^1 r\sqrt{1 - r^2}\,dr = -\frac{2\pi}{3}\left(1 - r^2\right)^{3/2}\Big|_0^1 = \frac{2\pi}{3}.$$

We therefore obtain

$$\iint_R \sqrt{1 - x^2 - y^2}\,dx\,dy = \frac{2\pi}{3},$$

in agreement with the known formula for the volume of a sphere.
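The ring sum of this example is easy to evaluate on a machine. The following sketch assumes that the m rings have equal width k = 1/m and uses the intermediate radius $\rho_\mu = (\mu + 1/2)k$ from the text; the function name `hemisphere_volume_polar` is an illustrative choice.

```python
import math


def hemisphere_volume_polar(m):
    """Sum of the ring contributions 2*pi*sqrt(1 - rho^2)*rho*k over m rings."""
    k = 1.0 / m
    total = 0.0
    for mu in range(m):
        rho = (mu + 0.5) * k   # intermediate radius of the ring
        total += 2 * math.pi * math.sqrt(1.0 - rho * rho) * rho * k
    return total


for m in (10, 100, 2000):
    print(m, hemisphere_volume_polar(m), 2 * math.pi / 3)
```

The sums approach $2\pi/3$, the volume of the unit hemisphere.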


d. Notation. Extensions. Fundamental Rules 


The rectangular subdivision of the region R is associated with the
symbol for the double integral used since Leibnitz’s time. Starting
with the symbol

$$\sum_{\nu=0}^{n-1} \sum_{\mu=0}^{m-1} f(\xi_\nu, \eta_\mu)\,\Delta x\,\Delta y$$

for the sum over the rectangles, we indicate the passage to the limit
from the sum to the integral by replacing the double summation sign
by a double integral sign and writing the symbol dx dy instead of the
product of the quantities $\Delta x\,\Delta y$. Accordingly, the double integral is
frequently written in the form

$$\iint_R f(x, y)\,dx\,dy$$

instead of the form

$$\iint_R f(x, y)\,dR$$

in which the area $\Delta R$ is replaced by the symbol dR. At this stage the
symbol dx dy merely refers symbolically to the passage to the limit of
the above sums of nm terms as n → ∞ and m → ∞.

It is clear that in double integrals, just as in ordinary integrals of
a single variable, the notation for the variables of integration is
immaterial, so that we could equally well have written

$$\iint_R f(u, v)\,du\,dv \quad\text{or}\quad \iint_R f(\xi, \eta)\,d\xi\,d\eta.$$

In introducing the concept of integral, we saw that for a positive 
function f(x, y) the integral represents the volume under the surface 
z = f(x, y). In the analytical definition of integral, however, it is quite 
unnecessary that the function f(x, y) should be positive everywhere; 
it may be negative, or it may change sign, in which case the surface 
intersects the region R. Thus, in the general case the integral gives 
the volume in question with a definite sign, the sign being positive 
for surfaces or portions of surfaces that lie above the x, y-plane. If the 
whole surface consists of several such portions, the integral rep- 
resents the sum of the corresponding volumes taken with their 
proper signs. In particular, a double integral may vanish, although the 
function under the integral sign does not vanish everywhere. 

For double integrals, as for single integrals, the following
fundamental rules hold; their proofs are simple repetitions of those in
Volume I (p. 138). If c is a constant, then

(4a)  $$\iint_R c\,f(x, y)\,dR = c \iint_R f(x, y)\,dR.$$

Furthermore, the integral of the sum of two functions is equal to the
sum of their two integrals (linearity of the operation of integration):

(4b)  $$\iint_R [f(x, y) + \phi(x, y)]\,dR = \iint_R f(x, y)\,dR + \iint_R \phi(x, y)\,dR.$$

Finally, if the region R consists of two subregions R′ and R″ that have
at most portions of the boundary in common, then

(4c)  $$\iint_R f(x, y)\,dR = \iint_{R'} f(x, y)\,dR + \iint_{R''} f(x, y)\,dR;$$

that is, when regions are joined together the corresponding integrals
are added (additivity of integrals).


e. Integral Estimates and the Mean Value Theorem 


As for ordinary integrals, there are some very useful estimates for
double integrals. Since the proofs are practically the same as those of
Volume I (p. 138), we shall be content to merely state the facts.

If f(x, y) ≥ 0 in R, then

(5a)  $$\iint_R f(x, y)\,dR \ge 0;$$

similarly, if f(x, y) ≤ 0,

(5b)  $$\iint_R f(x, y)\,dR \le 0.$$

This leads to the following result: If the inequality

(5c)  $$f(x, y) \ge g(x, y)$$

holds everywhere in R, then

(5d)  $$\iint_R f(x, y)\,dR \ge \iint_R g(x, y)\,dR.$$

A direct application of this theorem gives the relations

(5e)  $$\iint_R f(x, y)\,dR \le \iint_R |f(x, y)|\,dR$$

and

(5f)  $$\iint_R f(x, y)\,dR \ge -\iint_R |f(x, y)|\,dR.$$

We can also combine these two inequalities in a single formula:

(5g)  $$\left|\iint_R f(x, y)\,dR\right| \le \iint_R |f(x, y)|\,dR.$$


If m is the greatest lower bound and M the least upper bound of
the function f(x, y) in R, then

(6)  $$m\,\Delta R \le \iint_R f(x, y)\,dR \le M\,\Delta R,$$

where $\Delta R$ is the area of the region R. The integral can then be
expressed in the form

(7a)  $$\iint_R f(x, y)\,dR = \mu\,\Delta R,$$

where $\mu$ lies between m and M. The precise value of $\mu$ cannot in
general be specified more exactly.¹

This form of the estimation formula we again call the mean value 
theorem of the integral calculus. 
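A small numerical illustration of this mean value, offered only as a sketch: we take the unit square as R and f(x, y) = x² + y², for which the bounds are m = 0 and M = 2, and approximate the integral by a midpoint sum. The function name `mean_value` and the choice of region and integrand are assumptions made for the example.

```python
def mean_value(f, n=100):
    """Midpoint approximation of (integral of f over the unit square) / area."""
    h = 1.0 / n
    integral = 0.0
    for i in range(n):
        for j in range(n):
            integral += f((i + 0.5) * h, (j + 0.5) * h) * h * h
    area = 1.0  # area of the unit square
    return integral / area


mu = mean_value(lambda x, y: x * x + y * y)
lower, upper = 0.0, 2.0   # greatest lower and least upper bound of f on the square
print(lower, mu, upper)
```

The computed value lies near 2/3, between the bounds 0 and 2 as the estimate requires.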

Here again the following generalization holds: If p(x, y) is an
arbitrary positive continuous function in R, then

(7b)  $$\iint_R p(x, y)\,f(x, y)\,dR = \mu \iint_R p(x, y)\,dR,$$

where $\mu$ denotes a number between the greatest and least values of
f that cannot be further specified.

As before, these integral estimates show that the integral varies
continuously with the function. More precisely, let f(x, y) and g(x, y)
be two functions that in the whole region R satisfy the inequality

$$|f(x, y) - g(x, y)| < \varepsilon,$$

where $\varepsilon$ is a fixed positive number. If $\Delta R$ is the area of R, then the
integrals $\iint_R f(x, y)\,dR$ and $\iint_R g(x, y)\,dR$ differ by less than
$\varepsilon\,\Delta R$, that is, by less than a number that tends to zero with $\varepsilon$.

In the same way, we see that the integral of a function varies
continuously with the region. Suppose that two regions R′ and R″ are
obtained from one another by the addition or removal of portions
whose total area is less than $\varepsilon$, and let f(x, y) be a function continuous
in both regions such that |f(x, y)| < M, where M is a fixed number.
The two integrals $\iint_{R'} f(x, y)\,dR$ and $\iint_{R''} f(x, y)\,dR$ then differ by less
than $M\varepsilon$, that is, by less than a number that tends to zero with $\varepsilon$.
The proof of this fact follows at once from formula (4c) of p. 383.

¹Just as for integrals of continuous functions of one variable, the value $\mu$ is certainly
assumed at some point of the set R by the function f(x, y) if R is connected and f is
continuous.

We can therefore calculate the integral over a region R as accurate- 
ly as we please by taking it over a subregion of R whose total area 
differs from the area of R by a sufficiently small amount. For example, 
in the region R, we can construct a polygon whose total area differs 
by as little as we please from the area of R. In particular, we may 
suppose this polygon to be bounded by lines parallel to the x- and y- 
axes alternately, that is, to be pieced together out of rectangles with 
sides parallel to the axes. 


4.3 Integrals over Regions in Three and More Dimensions 


Every statement we have made for integrals over regions of the
x, y-plane can be extended without further complication or
introduction of new ideas to regions in three or more dimensions. For example,
to treat the integral over a three-dimensional region R, we need only
subdivide R (e.g., by means of a finite number of surfaces with
continuous nonparametric representations) into closed nonoverlapping
Jordan-measurable subregions $R_1, R_2, \ldots, R_N$ that completely fill R.
If f(x, y, z) is a function that is continuous in the closed region R
and if $(\xi_i, \eta_i, \zeta_i)$ denotes an arbitrary point in the region $R_i$, we again
form the sum

$$\sum_{i=1}^{N} f(\xi_i, \eta_i, \zeta_i)\,\Delta R_i,$$

in which $\Delta R_i$ denotes the volume of the region $R_i$. The sum is taken
over all the regions $R_i$ or, if it is more convenient, only over those
subregions that do not adjoin the boundary of R. If we now let the number
of subregions increase beyond all bounds in such a way that the
diameter of the largest of them tends to zero, we again find a limit
independent of the particular mode of subdivision and of the choice of
the intermediate points. This limit we call the integral of f(x, y, z)
over the region R, and we denote it by

(7c)  $$\iiint_R f(x, y, z)\,dR.$$


In particular, if we effect a subdivision of the region into rectangular
regions with sides $\Delta x$, $\Delta y$, $\Delta z$, the volumes of the inner regions $R_i$
will all have the same value $\Delta x\,\Delta y\,\Delta z$. As on p. 382, we indicate the
passage to the limit through the notation

$$\iiint_R f(x, y, z)\,dx\,dy\,dz.$$

Apart from the necessary changes in notation, all the facts that we
have mentioned for double integrals remain valid for triple integrals.
For regions of more than three dimensions, once we have suitably
defined the concept of volume for such regions, the multiple integral
can be defined in exactly the same way. If we restrict ourselves to
rectangular subregions and define the volume of a rectangular region

$$a_i \le x_i \le a_i + h_i \quad (i = 1, 2, \ldots, n)$$

as the product $h_1 h_2 \cdots h_n$, the definition of integral involves nothing
new. We denote an integral over the n-dimensional region R by

$$\int\!\!\int \cdots \int_R f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n.$$

For more general regions and more general subdivisions we must rely
on the abstract definition of volume given in the Appendix.

In what follows, we confine ourselves to integrals in at most three 
dimensions. 


4.4 Space Differentiation. Mass and Density 


For functions of one variable, the integrand is the derivative of the 
integral. This fact represents the fundamental connection between dif- 
ferential and integral calculus. For the multiple integrals of functions 
of several variables, the same connection exists; but here it is not so 
fundamental in character. 

We consider the multiple integral (domain integral)

$$\iint_B f(x, y)\,dB \quad\text{or}\quad \iiint_B f(x, y, z)\,dB$$

of a continuous function of two or three variables over a region B
that contains a fixed point P with coordinates $(x_0, y_0)$ or $(x_0, y_0, z_0)$,
respectively, and which has the content $\Delta B$. If we divide this integral
by the content $\Delta B$, it follows from formula (7a) that the quotient is
an intermediate value of the integrand, that is, a number between the
greatest and the least values of the integrand in the region. If we let
the diameter of the region B about the point P tend to zero, so that the
content $\Delta B$ also tends to zero, this intermediate value of the
function f must tend to its value at the point P. Thus, the passage to the
limit yields the respective relations

$$\lim_{\Delta B \to 0} \frac{1}{\Delta B} \iint_B f(x, y)\,dB = f(x_0, y_0)$$

and

(8)  $$\lim_{\Delta B \to 0} \frac{1}{\Delta B} \iiint_B f(x, y, z)\,dB = f(x_0, y_0, z_0).$$

This limiting process, which parallels the process of differentiation
for integrals with one independent variable, we call space
differentiation of the integral. We see, then, that space differentiation of a
multiple integral gives the integrand.
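Space differentiation can be watched numerically by averaging a function over smaller and smaller squares about a point. The sketch below is illustrative only: the function `mean_over_square`, the integrand x² + y, and the point P = (0.3, 0.2) are assumptions made for the example, and the integral over each square is approximated by a midpoint sum.

```python
def mean_over_square(f, x0, y0, side, n=50):
    """Approximate (1 / content) * integral of f over a square centred at (x0, y0)."""
    h = side / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x = x0 - side / 2 + (i + 0.5) * h
            y = y0 - side / 2 + (j + 0.5) * h
            total += f(x, y) * h * h
    return total / (side * side)


def f(x, y):
    return x * x + y


x0, y0 = 0.3, 0.2
errors = [abs(mean_over_square(f, x0, y0, side) - f(x0, y0))
          for side in (1.0, 0.1, 0.01)]
print(errors)
```

The printed differences between the mean value and f(P) shrink as the diameter of the square tends to zero.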

We can interpret the relation of integrand to integral in the case of
several independent variables by means of the physical concepts of
density and total mass. We think of a mass of a substance as distributed
over a three-dimensional region R in such a way that an arbitrarily
small mass is contained in each sufficiently small subregion. In order
to define the specific mass or density at a point P, we first consider a
neighborhood B of the point P with content $\Delta B$ and divide the total
mass in this neighborhood by the content. The quotient we shall call
the mean density or average density in this subregion. If we now let
the diameter of B tend to zero, from the average density in the region
B we obtain a limit called the density at the point P, provided always
that such a limit exists independently of the choice of the sequence
of regions. If we denote this density by $\mu(x, y, z)$ and assume that it is
continuous, we see at once that the process described above yields the
same value as the differentiation of the integral

$$\iiint_R \mu(x, y, z)\,dV,$$

taken over the whole region R. This integral taken over the whole
region therefore represents the total mass of the substance of density $\mu$
in the region R.¹


¹What we have shown is only that the distribution given by the multiple integral has
the same space-derivative as the mass-distribution originally given. It remains to be
proved that this implies that the two distributions are actually identical; in other
words, that the statement “space differentiation gives the density $\mu$” can be satisfied
by only one distribution of mass. The proof, although not difficult, is passed over
here. We have to assume that mass is additive, that is, that for a region R consisting
of two nonoverlapping regions R′ and R″, the mass of R is the sum of the masses of
R′ and R″.

From the physical point of view such a representation of the mass
of a substance is naturally an idealization. That this idealization is
reasonable, that is, that it approximates to the actual situation with
sufficient accuracy, is one of the assumptions of physics.

These ideas, moreover, retain their mathematical significance even
when $\mu$ is not positive everywhere. Negative densities and masses
may also have a physical interpretation, for example, in the study of
the distribution of electric charge.


4.5 Reduction of the Multiple Integral to Repeated Single 
Integrals 


The fact that every multiple integral can be reduced to single in- 
tegrals is of fundamental importance in the evaluation of multiple 
integrals. It enables us to apply all the methods that we have previous- 
ly developed for finding indefinite integrals to the evaluation of mul- 
tiple integrals. 


a. Integrals over a Rectangle 


First we take the region R as a rectangle a ≤ x ≤ b, $\alpha \le y \le \beta$
in the x, y-plane and consider a continuous function f(x, y) in R. We
then have the theorem:


To find the double integral of f(x, y) over the region R, we first regard
y as constant and integrate f(x, y) with respect to x between the limits
a and b. This integral

$$\phi(y) = \int_a^b f(x, y)\,dx$$

is a function of the parameter y, which we integrate between the limits
$\alpha$ and $\beta$ to obtain the double integral. In symbols,

$$\iint_R f(x, y)\,dR = \int_\alpha^\beta \phi(y)\,dy, \qquad \phi(y) = \int_a^b f(x, y)\,dx,$$

or more briefly,

(9a)  $$\iint_R f(x, y)\,dR = \int_\alpha^\beta dy \int_a^b f(x, y)\,dx.$$


In order to prove this statement, we return to the definition of the
multiple integral (3c). Taking

$$h = \frac{b - a}{m}, \qquad k = \frac{\beta - \alpha}{n},$$

we have

$$\iint_R f(x, y)\,dR = \lim_{\substack{m \to \infty \\ n \to \infty}} \sum_{\nu=1}^{n} \sum_{\mu=1}^{m} f(a + \mu h, \alpha + \nu k)\,hk.$$

Here the limit is to be understood to mean that the sum on the
right-hand side differs from the value of the integral by less than an
arbitrarily small preassigned positive quantity $\varepsilon$, provided only that the
numbers m and n are both larger than a bound N depending only on
$\varepsilon$. By introducing the expression¹

$$\Phi_\nu = \sum_{\mu=1}^{m} f(a + \mu h, \alpha + \nu k)\,h$$

we can write this sum in the form

$$\sum_{\nu=1}^{n} \Phi_\nu k.$$

If we now choose an arbitrary fixed value for $\varepsilon$ and for n choose a fixed
number greater than N, we know that

$$\left|\iint_R f(x, y)\,dR - k \sum_{\nu=1}^{n} \Phi_\nu\right| < \varepsilon$$

no matter what the number m is, provided only that it is greater than
N. If we keep n fixed and let m tend to infinity, the above expression
never exceeds $\varepsilon$. According to the definition of the ordinary integral,
however, in this limiting process the expression $\Phi_\nu$ tends to the
integral

$$\int_a^b f(x, \alpha + \nu k)\,dx = \phi(\alpha + \nu k),$$

and, therefore, we obtain

$$\left|\iint_R f(x, y)\,dR - k \sum_{\nu=1}^{n} \phi(\alpha + \nu k)\right| \le \varepsilon.$$


Whatever the value of $\varepsilon$, this inequality holds for all values of n that
are greater than a fixed number N depending only on $\varepsilon$. If we now let
n tend to ∞ (i.e., let k tend to zero), then by the definition of “integral”
and the continuity (see p. 74) of

$$\int_a^b f(x, y)\,dx = \phi(y)$$

we obtain

$$\lim_{n \to \infty} k \sum_{\nu=1}^{n} \phi(\alpha + \nu k) = \int_\alpha^\beta \phi(y)\,dy;$$

whence

$$\left|\iint_R f(x, y)\,dR - \int_\alpha^\beta \phi(y)\,dy\right| \le \varepsilon.$$

Since $\varepsilon$ can be chosen as small as we please and the left-hand side is
a fixed number, this inequality can only hold if the left-hand side
vanishes, that is, if

$$\iint_R f(x, y)\,dR = \int_\alpha^\beta dy \int_a^b f(x, y)\,dx.$$

This gives the required transformation.

The result permits one to reduce double integration to two
successive single integrations.

¹The root idea of this proof is simply that of resolving the double limit as
m and n increase simultaneously into the two successive single limiting processes:
first, m → ∞ when n is fixed, and then, n → ∞.

Since the parts played by x and y are interchangeable, no further 
proof is required to show that the equation 


(9b)  $$\iint_R f(x, y)\,dR = \int_a^b dx \int_\alpha^\beta f(x, y)\,dy$$


is also true. 


b. Change of Order of Integration. Differentiation under the 
Integral Sign 


The two formulae (9a), (9b) yield the relation

(9c)  $$\int_\alpha^\beta dy \int_a^b f(x, y)\,dx = \int_a^b dx \int_\alpha^\beta f(x, y)\,dy$$

(already proved in a different way on p. 80) or, in words:

In the repeated integration of a continuous function with constant
limits of integration the order of integration can be reversed.
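The equality of the two repeated integrals can be illustrated with a simple midpoint rule applied once in each order. This is a sketch under our own assumptions: the helper names `midpoint`, `x_then_y`, `y_then_x` and the integrand x²y + y³ on 0 ≤ x ≤ 1, 0 ≤ y ≤ 2 are chosen for the example; the exact value of the double integral in this case is 14/3.

```python
def midpoint(g, lo, hi, n=200):
    """Midpoint-rule approximation of the single integral of g from lo to hi."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h


def x_then_y(f, a, b, alpha, beta):
    # phi(y) = integral of f(x, y) dx from a to b, then integrate phi over y
    return midpoint(lambda y: midpoint(lambda x: f(x, y), a, b), alpha, beta)


def y_then_x(f, a, b, alpha, beta):
    # integrate with respect to y first, then with respect to x
    return midpoint(lambda x: midpoint(lambda y: f(x, y), alpha, beta), a, b)


def f(x, y):
    return x * x * y + y ** 3


first = x_then_y(f, 0.0, 1.0, 0.0, 2.0)
second = y_then_x(f, 0.0, 1.0, 0.0, 2.0)
print(first, second, 14.0 / 3.0)
```

Both orders give the same number up to rounding, and that number is close to 14/3.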

The theorem on the change of order in integration has many ap- 
plications. In particular, it is frequently used in the explicit calcula- 
tion of simple definite integrals for which no indefinite integral can 
be found. 

As an example (for further examples see the Appendix), we
consider the integral

$$I = \int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\,dx,$$

which converges for a > 0, b > 0. We can express I as a repeated
integral in the form

$$I = \int_0^\infty dx \int_a^b e^{-xy}\,dy.$$

In this improper repeated integral we cannot at once apply our
theorem on change of order. If, however, we write

$$I = \lim_{T \to \infty} \int_0^T dx \int_a^b e^{-xy}\,dy,$$

we obtain by changing the order of integration

$$I = \lim_{T \to \infty} \int_a^b \frac{1 - e^{-Ty}}{y}\,dy = \log\frac{b}{a} - \lim_{T \to \infty} \int_a^b \frac{e^{-Ty}}{y}\,dy.$$

In virtue of the relation

$$\int_a^b \frac{e^{-Ty}}{y}\,dy = \int_{Ta}^{Tb} \frac{e^{-y}}{y}\,dy,$$

the second integral tends to zero as T increases; hence,

(11a)  $$I = \int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\,dx = \log\frac{b}{a}.$$
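Formula (11a) can be checked numerically. The sketch below is an illustration under stated assumptions: the improper integral is truncated at T = 60, where the integrand is negligible for the sample values a = 1, b = 3, and the remaining integral is approximated by a midpoint rule, whose sample points never hit x = 0. The function name `truncated_frullani` is ours.

```python
import math


def truncated_frullani(a, b, T=60.0, n=200_000):
    """Midpoint approximation of the integral of (e^(-a x) - e^(-b x)) / x over [0, T]."""
    h = T / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h          # midpoints, so x is never zero
        total += (math.exp(-a * x) - math.exp(-b * x)) / x
    return total * h


a, b = 1.0, 3.0
approx = truncated_frullani(a, b)
print(approx, math.log(b / a))
```

For these values the two printed numbers agree to several decimal places.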


In a similar way we can prove the following general theorem:
If f(t) is sectionally smooth for t ≥ 0 and if the integral

$$\int_1^\infty \frac{f(t)}{t}\,dt$$

exists, then for positive a and b

(11b)  $$I = \int_0^\infty \frac{f(ax) - f(bx)}{x}\,dx = f(0) \log\frac{b}{a}.$$

Here we can again express the single integral as the repeated
integral

$$I = \int_0^\infty dx \int_b^a f'(xy)\,dy$$

and change the order of integration.


c. Reduction of Double to Single Integrals for More General 
Regions 


By a simple extension of the results already obtained, we can derive
analogous results for regions more general than rectangles. We begin
by considering a convex region R, that is, a region whose boundary
curve is not cut by any straight line in more than two points unless
the whole straight line between these two points is a part of the
boundary (Fig. 4.4). We suppose that the region lies between the lines of
support (i.e., lines containing a boundary point of R but not
separating any two points of R) $x = x_0$, $x = x_1$ and $y = y_0$, $y = y_1$,
respectively. Since the x-coordinate for any point of R lies in the interval
$x_0 \le x \le x_1$ and the y-coordinate in the interval $y_0 \le y \le y_1$, we
consider the integrals

$$\int_{\phi_1(y)}^{\phi_2(y)} f(x, y)\,dx$$

and

$$\int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy,$$

which are taken along the segments in which the lines y = constant
and x = constant, respectively, intersect the region. Here $\phi_2(y)$ and
$\phi_1(y)$ denote the abscissae of the points in which the boundary of the
region is intersected by the line y = constant, and $\psi_2(x)$ and $\psi_1(x)$ the
ordinates of the points in which the boundary is intersected by the
lines x = constant. The integral

$$\int_{\phi_1(y)}^{\phi_2(y)} f(x, y)\,dx$$

is therefore a function of the parameter y, where the parameter
appears both under the integral sign and in the upper and lower limits,
and a similar statement holds for the integral

$$\int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy$$

as a function of x. The resolution into repeated integrals is then given
by the equations

(12)  $$\iint_R f(x, y)\,dR = \int_{y_0}^{y_1} dy \int_{\phi_1(y)}^{\phi_2(y)} f(x, y)\,dx = \int_{x_0}^{x_1} dx \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy.$$

Figure 4.4 General convex region of integration.


To prove this we first choose a sequence of points on the arc
$y = \psi_2(x)$, the distance between successive points being less than a positive
number $\delta$. We join successive points by paths, each consisting of a
horizontal and a vertical line segment lying in R. The lower
boundary $y = \psi_1(x)$, we treat similarly, choosing points with the same
abscissae as on the upper boundary. We thus obtain a region $\bar{R}$ in R,
consisting of a finite number of rectangles, where the boundary of $\bar{R}$
above and below is represented by sectionally constant functions
$y = \bar\psi_2(x)$ and $y = \bar\psi_1(x)$, respectively (cf. Fig. 4.5). By the known theorem
for rectangles, we have

$$\iint_{\bar{R}} f(x, y)\,dR = \int_{x_0}^{x_1} dx \int_{\bar\psi_1(x)}^{\bar\psi_2(x)} f(x, y)\,dy.$$

Figure 4.5

Since $\psi_1(x)$ and $\psi_2(x)$ are uniformly continuous, as $\delta \to 0$, the functions
$\bar\psi_1(x)$ and $\bar\psi_2(x)$ tend uniformly to $\psi_1(x)$ and $\psi_2(x)$, respectively, and so,

$$\lim_{\delta \to 0} \int_{\bar\psi_1(x)}^{\bar\psi_2(x)} f(x, y)\,dy = \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy$$

uniformly in x. It follows that

$$\lim_{\delta \to 0} \int_{x_0}^{x_1} dx \int_{\bar\psi_1(x)}^{\bar\psi_2(x)} f(x, y)\,dy = \int_{x_0}^{x_1} dx \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy.$$

On the other hand, as $\delta \to 0$, the region $\bar{R}$ tends to R. Hence,

$$\lim_{\delta \to 0} \iint_{\bar{R}} f(x, y)\,dR = \iint_R f(x, y)\,dR.$$

Combining the three equations, we have

$$\iint_R f(x, y)\,dR = \int_{x_0}^{x_1} dx \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy.$$


The other statement can be established in a similar way. 
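Both resolutions with variable limits can be tried on the unit disc, where the boundary functions are $\pm\sqrt{1 - x^2}$ and $\pm\sqrt{1 - y^2}$. The sketch below is illustrative: the helper names and the integrand f(x, y) = x², whose integral over the unit disc is $\pi/4$, are assumptions made for the example, and each single integral is approximated by a midpoint rule.

```python
import math


def midpoint(g, lo, hi, n=300):
    """Midpoint-rule approximation of the single integral of g from lo to hi."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h


def half_chord(t):
    """Half-length of the chord of the unit disc at distance |t| from the centre."""
    return math.sqrt(max(0.0, 1.0 - t * t))


def dy_inner(f):
    # integrate over y between psi_1(x) and psi_2(x), then over x
    return midpoint(
        lambda x: midpoint(lambda y: f(x, y), -half_chord(x), half_chord(x)),
        -1.0, 1.0)


def dx_inner(f):
    # integrate over x between phi_1(y) and phi_2(y), then over y
    return midpoint(
        lambda y: midpoint(lambda x: f(x, y), -half_chord(y), half_chord(y)),
        -1.0, 1.0)


def f(x, y):
    return x * x


print(dy_inner(f), dx_inner(f), math.pi / 4)
```

The two orders of integration lead to nearly the same value, close to $\pi/4$.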

A similar argument is available if we abandon the hypothesis of
convexity and consider regions of the form indicated in Fig. 4.6. We
assume merely that the boundary curve of the region is intersected
by every parallel to the x-axis and by every parallel to the y-axis in
a bounded number of points or intervals. By $\int f(x, y)\,dy$, we then mean
the sum of the integrals of the function f(x, y) for a fixed x, taken over
all the intervals that the line x = constant has in common with the
closed region. For nonconvex regions the number of these intervals
may exceed unity. It may change suddenly at a point $x = \xi$ (as in Fig.
4.6, right) in such a way that the expression $\int f(x, y)\,dy$ has a
jump-discontinuity at this point. Without essential changes in the proof,
however, the resolution of the double integral

$$\iint_R f(x, y)\,dR = \int dx \int f(x, y)\,dy$$

remains valid, the integration with respect to x being taken along the
whole interval $x_0 \le x \le x_1$ over which the region R lies. Naturally,
the corresponding resolution

$$\iint_R f(x, y)\,dR = \int dy \int f(x, y)\,dx$$

also holds.

Figure 4.6 Nonconvex regions of integration.
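In the example of the circle defined by x² + y² ≤ 1, we have

$$\iint_R f(x, y)\,dR = \int_{-1}^{+1} dx \int_{-\sqrt{1 - x^2}}^{+\sqrt{1 - x^2}} f(x, y)\,dy.$$

If the region is a circular ring between the circles x² + y² = 1 and
x² + y² = 4 (Fig. 4.7), then

$$\begin{aligned}
\iint_R f(x, y)\,dx\,dy
&= \int_{-2}^{-1} dx \int_{-\sqrt{4 - x^2}}^{+\sqrt{4 - x^2}} f(x, y)\,dy
 + \int_{1}^{2} dx \int_{-\sqrt{4 - x^2}}^{+\sqrt{4 - x^2}} f(x, y)\,dy \\
&\quad + \int_{-1}^{+1} dx \int_{-\sqrt{4 - x^2}}^{-\sqrt{1 - x^2}} f(x, y)\,dy
 + \int_{-1}^{+1} dx \int_{+\sqrt{1 - x^2}}^{+\sqrt{4 - x^2}} f(x, y)\,dy.
\end{aligned}$$

Figure 4.7 Circular ring as region of integration.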
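The four-term resolution for the circular ring can be tested with the simplest integrand, f = 1, for which the result must be the area $4\pi - \pi = 3\pi$ of the ring. The following sketch is an illustration only; the helper names `midpoint`, `strip`, and `ring_integral` are our own, and every single integral is approximated by a midpoint rule.

```python
import math


def midpoint(g, lo, hi, n=200):
    """Midpoint-rule approximation of the single integral of g from lo to hi."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h


def outer(x):
    return math.sqrt(max(0.0, 4.0 - x * x))   # circle of radius 2


def inner(x):
    return math.sqrt(max(0.0, 1.0 - x * x))   # circle of radius 1


def strip(f, x_lo, x_hi, y_lo, y_hi):
    """Repeated integral over x_lo <= x <= x_hi, y_lo(x) <= y <= y_hi(x)."""
    return midpoint(
        lambda x: midpoint(lambda y: f(x, y), y_lo(x), y_hi(x)), x_lo, x_hi)


def ring_integral(f):
    return (strip(f, -2.0, -1.0, lambda x: -outer(x), outer)
            + strip(f, 1.0, 2.0, lambda x: -outer(x), outer)
            + strip(f, -1.0, 1.0, lambda x: -outer(x), lambda x: -inner(x))
            + strip(f, -1.0, 1.0, inner, outer))


area = ring_integral(lambda x, y: 1.0)
print(area, 3 * math.pi)
```

The computed value is close to $3\pi$, which supports the choice of limits in the four terms.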


As a final example we take as the region R a triangle (Fig. 4.8)
bounded by the lines x = y, y = 0, and x = a (a > 0). Integrating
either first with respect to x, or with respect to y, we obtain

(13a)  $$\iint_R f(x, y)\,dR = \int_0^a dx \int_0^x f(x, y)\,dy = \int_0^a dy \int_y^a f(x, y)\,dx.$$

Figure 4.8 Triangle as region of integration.

In particular, if f(x, y) depends on y only, our formula gives

(13b)  $$\int_0^a dx \int_0^x f(y)\,dy = \int_0^a f(y)\,(a - y)\,dy.$$
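Formula (13b) lends itself to a quick numerical comparison of its two sides. The sketch below is illustrative; the helper names and the choice f(y) = cos y with a = 2 are assumptions for the example, in which both sides equal 1 - cos 2.

```python
import math


def midpoint(g, lo, hi, n=300):
    """Midpoint-rule approximation of the single integral of g from lo to hi."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h


def repeated_side(f, a):
    # left-hand side of (13b): integrate f from 0 to x, then integrate over x
    return midpoint(lambda x: midpoint(f, 0.0, x), 0.0, a)


def single_side(f, a):
    # right-hand side of (13b): one single integral with the weight (a - y)
    return midpoint(lambda y: f(y) * (a - y), 0.0, a)


a = 2.0
left = repeated_side(math.cos, a)
right = single_side(math.cos, a)
print(left, right, 1.0 - math.cos(a))
```

Both sides agree with 1 - cos 2 to the accuracy of the midpoint rule.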


From this we see that if the indefinite integral $\int_0^x f(y)\,dy$ of a function
f(y) is integrated again, the result can be expressed by a single integral
(cf. Volume I, p. 320).


d. Extension of the Results to Regions in Several Dimensions


The corresponding theorems in more than two dimensions are so
closely analogous to those already given that it is sufficient to state
them without proof. If we first consider the rectangular region
$x_0 \le x \le x_1$, $y_0 \le y \le y_1$, $z_0 \le z \le z_1$, and a function f(x, y, z) continuous
in this region, we can reduce the triple integral

$$V = \iiint_R f(x, y, z)\,dR$$

in several ways to single integrals or double integrals. Thus,

(14a)  $$\iiint_R f(x, y, z)\,dR = \int_{z_0}^{z_1} dz \iint_B f(x, y, z)\,dx\,dy.$$

Here

$$\iint_B f(x, y, z)\,dx\,dy$$

is the double integral of the function taken over the rectangle B
described by $x_0 \le x \le x_1$, $y_0 \le y \le y_1$, z being kept constant as a
parameter during this integration so that the double integral is a function
of the parameter z. Either of the remaining coordinates x and y can be
singled out in the same way.

Moreover, the triple integral V can also be represented as a
repeated integral in the form of a succession of three single integrations.
In this representation we first consider the expression

$$\int_{z_0}^{z_1} f(x, y, z)\,dz,$$

x and y being fixed, and then consider

$$\int_{y_0}^{y_1} dy \int_{z_0}^{z_1} f(x, y, z)\,dz,$$

x being fixed. We finally obtain

(14b)  $$V = \int_{x_0}^{x_1} dx \int_{y_0}^{y_1} dy \int_{z_0}^{z_1} f(x, y, z)\,dz.$$

In this repeated integral we could equally well have integrated first
with respect to x, then with respect to y, and finally with respect to z
and we could have made any other change in the order of integration,
since the repeated integral is always equal to the triple integral. We
therefore have the following theorem:

A repeated integral of a continuous function throughout a closed
rectangular region is independent of the order of integration.

The way in which the resolution is to be performed for
nonrectangular regions in three dimensions scarcely requires special mention.¹
We content ourselves with writing down the resolution for a spherical
region x² + y² + z² ≤ 1:

$$\iiint f(x, y, z)\,dx\,dy\,dz = \int_{-1}^{+1} dx \int_{-\sqrt{1 - x^2}}^{+\sqrt{1 - x^2}} dy \int_{-\sqrt{1 - x^2 - y^2}}^{+\sqrt{1 - x^2 - y^2}} f(x, y, z)\,dz.$$
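As a quick check of these limits (a worked example added here for illustration, not part of the original text), we may take f = 1, in which case the repeated integral must give the volume of the unit sphere. The innermost integration yields $2\sqrt{1 - x^2 - y^2}$; the integration with respect to y then gives the area $\pi(1 - x^2)$ of a circular cross section, and we find

```latex
\int_{-1}^{+1} dx \int_{-\sqrt{1-x^2}}^{+\sqrt{1-x^2}} 2\sqrt{1 - x^2 - y^2}\,dy
  = \int_{-1}^{+1} \pi\,(1 - x^2)\,dx
  = \pi\left(2 - \frac{2}{3}\right)
  = \frac{4\pi}{3}.
```

This is twice the value $2\pi/3$ found for the hemisphere by means of the polar coordinate net.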


4.6 Transformation of Multiple Integrals 


a. Transformation of Integrals in the Plane 


The introduction of a new variable of integration is one of the chief 
methods for transforming and simplifying single integrals. The intro- 
duction of new variables is also extremely important for multiple in- 
tegrals. In spite of their reduction to single integrals, the explicit 
evaluation of multiple integrals is generally more difficult than for one 
independent variable and integration in terms of elementary func- 
tions is less likely. Yet often we can evaluate such integrals by in- 
troducing new variables in place of the original ones under the inte- 
gral sign. Quite apart from the question of the explicit evaluation of 
double integrals, the transformation theory is important for the com- 
plete mastery of the concept of integral that it gives us. 

The important special transformation to polar coordinates has al- 
ready been indicated on p. 378. Here we shall proceed at once to 
general transformations. First, we consider the case of a double inte- 
gral 


$$\iint_R f(x, y)\,dR = \iint_R f(x, y)\,dx\,dy,$$

taken over a region R of the x, y-plane. Let the equations

$$x = \phi(u, v), \qquad y = \psi(u, v)$$

give a 1-1 mapping of the region R onto the closed region R′ of the
u, v-plane. We assume that in the region R′ the functions $\phi$ and $\psi$ have
continuous partial derivatives of the first order and that their Jacobian

$$D = \begin{vmatrix} \phi_u & \phi_v \\ \psi_u & \psi_v \end{vmatrix} = \phi_u \psi_v - \psi_u \phi_v$$

never vanishes in R′. More precisely, we make the assumption that
the system of functions $x = \phi(u, v)$, $y = \psi(u, v)$ possesses a unique
inverse u = g(x, y), v = h(x, y) (p. 261). Moreover, the two families of
curves u = constant and v = constant form a net over the region R.

¹For a general proof, see the Appendix, p. 531.

Heuristic considerations readily suggest how the integral
$\iint_R f(x, y)\,dR$ can be expressed as an integral with respect to u and v.
We naturally think of calculating the double integral $\iint_R f(x, y)\,dR$ by
abandoning the rectangular subdivision of the region R and instead
using a subdivision into subregions $R_i$ by means of curves of the
net u = constant or v = constant. We therefore consider the values
$u = \nu h$ and $v = \mu k$, where $h = \Delta u$ and $k = \Delta v$ are given numbers and
$\nu$ and $\mu$ take all integer values such that the lines $u = \nu h$ and $v = \mu k$
intersect $R'$ (so that their images are curves in R). These curves define a
number of meshes, and for the subregions $R_i$ we choose those meshes
that lie in the interior of R (Figs. 4.9 and 4.10). We now have to find the
area of such a mesh.


[Figure 4.9]    [Figure 4.10]


If the mesh, instead of being bounded by curves, were a parallelogram
with vertices corresponding to the values $(u_\nu, v_\mu)$, $(u_\nu + h, v_\mu)$,
$(u_\nu, v_\mu + k)$, and $(u_\nu + h, v_\mu + k)$, then by a formula of analytical
geometry (cf. Chapter 2, p. 180) the area of the mesh would be the absolute
value of the determinant

$$\begin{vmatrix} \phi(u_\nu + h, v_\mu) - \phi(u_\nu, v_\mu) & \phi(u_\nu, v_\mu + k) - \phi(u_\nu, v_\mu) \\ \psi(u_\nu + h, v_\mu) - \psi(u_\nu, v_\mu) & \psi(u_\nu, v_\mu + k) - \psi(u_\nu, v_\mu) \end{vmatrix},$$

which is approximately equal to

$$\begin{vmatrix} \phi_u(u_\nu, v_\mu) & \phi_v(u_\nu, v_\mu) \\ \psi_u(u_\nu, v_\mu) & \psi_v(u_\nu, v_\mu) \end{vmatrix}\, hk = hkD.$$

On multiplying this expression by the value of the function f in the
corresponding mesh, summing over all the regions $R_i$ lying entirely
within R, and then passing to the limit as $h \to 0$ and $k \to 0$, we obtain
the expression

$$\iint_{R'} f(\phi(u, v), \psi(u, v))\,|D|\,du\,dv$$

for the integral transformed to the new variables.

This discussion is incomplete, however, since we have not shown
that it is permissible to replace the curvilinear meshes by parallelograms
or to replace the area of such a parallelogram by the expression
$|\phi_u \psi_v - \psi_u \phi_v|\,hk$; that is, we have not shown that the error introduced
in this way vanishes in the limit as $h \to 0$ and $k \to 0$. Instead of
completing the proof by making the proper estimates (which will be done in
the Appendix), we prefer to prove the transformation formula in a
somewhat different way, one that can subsequently be extended
directly to regions of higher dimensions.
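The size of the error made by the parallelogram approximation can be seen in a simple special case. The sketch below is an illustration only, not part of the proof; the choice of the polar map $x = u\cos v$, $y = u\sin v$ (for which $D = u$) and the function names are assumptions made for the example. The image of the rectangle $[u, u+h] \times [v, v+k]$ is an annular sector whose exact area is known, so it can be compared with $hkD$.

```python
def exact_mesh_area(u, h, k):
    """Exact area of the image of [u, u+h] x [v, v+k] under the polar map.

    The image is an annular sector with radii u and u+h and opening angle k.
    """
    return 0.5 * ((u + h) ** 2 - u ** 2) * k


def parallelogram_area(u, h, k):
    """Approximation h*k*D with D = u, the Jacobian of the polar map."""
    return h * k * u


# The ratio exact/approximate equals 1 + h/(2u), so it tends to 1 as h -> 0.
for h in (0.1, 0.01, 0.001):
    ratio = exact_mesh_area(1.0, h, h) / parallelogram_area(1.0, h, h)
    print(h, ratio)
```

For this particular map the relative error is exactly $h/(2u)$, which is of higher order than the mesh area itself; the general estimate is what the text defers to the Appendix.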

For this purpose, we use the results of Chapter 3 (p. 264) and per- 
form the transformation from the variables x, y to the new variables 
u, v in two steps instead of one. We replace the variables x, y by new 
variables x, v through the equations 


$$x = x, \qquad y = \Phi(v, x).$$


Here we assume that the expression $\Phi_v$ vanishes nowhere in the region
R, say, that $\Phi_v$ is everywhere greater than zero, and that the whole
region R can be mapped in a 1-1 way on the region B of the x, v-plane.
We then map this region B in a 1-1 way on the region $R'$ of the
u, v-plane by means of a second transformation

$$x = \Psi(u, v), \qquad v = v,$$

where we further assume that the expression $\Psi_u$ is positive throughout
the region B. We now effect the transformation of the integral
$\iint_R f(x, y)\,dx\,dy$ in two steps. We start with a subdivision of the region
B into rectangular subregions of sides $\Delta x = h$ and $\Delta v = k$ bounded
by the lines $x = \text{constant} = x_\nu$ and $v = \text{constant} = v_\mu$ in the
x, v-plane. This subdivision of B corresponds to a subdivision of the region
R into subregions $R_i$, each subregion being bounded by two parallel
lines $x = x_\nu$ and $x = x_\nu + h$ and by arcs of the two curves $y = \Phi(v_\mu, x)$
and $y = \Phi(v_\mu + k, x)$ (Figs. 4.11 and 4.12). By the elementary
interpretation of the single integral, the area of the subregion (Fig. 4.13)
is

$$\Delta R_i = \int_{x_\nu}^{x_\nu + h} [\Phi(v_\mu + k, x) - \Phi(v_\mu, x)]\,dx.$$

[Figure 4.11]    [Figure 4.12]
[Figure 4.13: the subregion bounded by the curves $y = \Phi(v_\mu + k, x)$, $y = \Phi(v_\mu, x)$ and the lines $x = x_\nu$, $x = x_\nu + h$]

By the mean value theorem of the integral calculus, this can be
written in the form

$$\Delta R_i = h\,[\Phi(v_\mu + k, \bar{x}_\nu) - \Phi(v_\mu, \bar{x}_\nu)],$$

where $\bar{x}_\nu$ is a number between $x_\nu$ and $x_\nu + h$. By the mean value
theorem of the differential calculus, this finally becomes

$$\Delta R_i = hk\,\Phi_v(\bar{v}_\mu, \bar{x}_\nu),$$

in which $\bar{v}_\mu$ denotes a value between $v_\mu$ and $v_\mu + k$, so that
$(\bar{v}_\mu, \bar{x}_\nu)$ are the coordinates of a point of the subregion in B
under consideration. The integral over R is therefore the limit of the sum

$$\sum f_i\,\Delta R_i = \sum hk\, f(\bar{x}_\nu, \Phi(\bar{v}_\mu, \bar{x}_\nu))\,\Phi_v(\bar{v}_\mu, \bar{x}_\nu)$$

as $h \to 0$, $k \to 0$. We see at once that the expression on the right tends
to the integral

$$\iint_B f(x, y)\,\Phi_v\,dx\,dv \qquad (y = \Phi(v, x))$$

taken over the region B. Therefore,

$$\iint_R f(x, y)\,dx\,dy = \iint_B f(x, y)\,\Phi_v\,dx\,dv.$$


To the integral on the right we now apply exactly the same argument
as that just employed for $\iint_R f(x, y)\,dx\,dy$ and transform the region B
into the region $R'$ by means of the equations $x = \Psi(u, v)$, $v = v$.

The integral over B then becomes an integral over $R'$ with an
integrand of the form $f(x, y)\,\Phi_v \Psi_u$, namely,

$$\iint_{R'} f(x, y)\,\Phi_v \Psi_u\,du\,dv.$$

Here the quantities x and y are to be expressed in terms of the
independent variables u and v by means of the two transformations above.
We have therefore proved the transformation formula

$$(16a)\qquad \iint_R f(x, y)\,dx\,dy = \iint_{R'} f(x, y)\,\Phi_v \Psi_u\,du\,dv.$$

By introducing the direct transformation $x = \phi(u, v)$, $y = \psi(u, v)$ the
formula can at once be put in the form stated previously. For

$$\frac{d(x, y)}{d(x, v)} = \Phi_v \qquad \text{and} \qquad \frac{d(x, v)}{d(u, v)} = \Psi_u,$$

and so, by Chapter 3 (p. 258), we have

$$D = \frac{d(x, y)}{d(u, v)} = \Phi_v \Psi_u.$$

We have therefore established the transformation formula whenever
the transformation $x = \phi(u, v)$, $y = \psi(u, v)$ can be resolved into a
succession of two primitive transformations of the forms¹ $x = x$,
$y = \Phi(v, x)$ and $v = v$, $x = \Psi(u, v)$.

¹We have assumed above that the two derivatives $\Phi_v$ and $\Psi_u$ are positive, but we
easily see that this is not a serious restriction. If it is not satisfied, we merely have to
replace $\Phi_v \Psi_u$ by its absolute value in formula (16a).

In Chapter 3 (p. 265), however, we saw that for $D \ne 0$ we can
subdivide a closed region R into a finite number of regions in each of
which such a resolution is possible, except perhaps that it may be
necessary to interchange u and v, but this does not affect the value of
the integral. We thus arrive at the following general result:


If the transformation $x = \phi(u, v)$, $y = \psi(u, v)$ represents a continuous
1-1 mapping of the closed Jordan-measurable region R of the x, y-plane
on a region $R'$ of the u, v-plane, and if the functions $\phi$ and $\psi$ have
continuous first derivatives and their Jacobian

$$\frac{d(x, y)}{d(u, v)} = \phi_u \psi_v - \psi_u \phi_v$$

is everywhere different from zero, then

$$(16b)\qquad \iint_R f(x, y)\,dx\,dy = \iint_{R'} f(\phi(u, v), \psi(u, v)) \left|\frac{d(x, y)}{d(u, v)}\right| du\,dv.$$


For completeness we add that the transformation formula remains 
valid if the determinant d(x, y)/d(u, v) vanishes without reversing its 
sign at a finite number of isolated points of the region, for then we 
have only to cut these points out of R by enclosing them in small cir- 
cles of radius p. The proof is valid for the residual region. If we then 
let p tend to zero, the transformation formula continues to hold for 
the region R by virtue of the continuity of all the functions involved. 
This fact permits us to introduce polar coordinates with the origin in 
the interior of the region; for the Jacobian, being equal to r, vanishes 
at the origin. 
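Formula (16b) can be checked numerically in the polar case just mentioned. The following is a minimal sketch under stated assumptions: the integrand $f(x, y) = x^2 + y^2$, the unit disk as region, and the helper `midpoint_double` are choices made for this illustration and do not come from the text. Both sides should be close to the exact value $\pi/2$.

```python
import math


def midpoint_double(g, a, b, c, d, n=400):
    """Midpoint-rule approximation of the integral of g(s, t) over [a, b] x [c, d]."""
    hs = (b - a) / n
    ht = (d - c) / n
    total = 0.0
    for i in range(n):
        s = a + (i + 0.5) * hs
        for j in range(n):
            t = c + (j + 0.5) * ht
            total += g(s, t)
    return total * hs * ht


def f(x, y):
    return x * x + y * y


def f_on_disk(x, y):
    # Left side of (16b): f restricted to the unit disk, 0 outside it.
    return f(x, y) if x * x + y * y <= 1.0 else 0.0


def transformed(r, theta):
    # Right side of (16b) with u = r, v = theta and |d(x, y)/d(r, theta)| = r.
    return f(r * math.cos(theta), r * math.sin(theta)) * r


cartesian = midpoint_double(f_on_disk, -1.0, 1.0, -1.0, 1.0)
polar = midpoint_double(transformed, 0.0, 1.0, 0.0, 2.0 * math.pi)
print(cartesian, polar, math.pi / 2)
```

The polar computation converges much faster here, because the transformed region is a rectangle and the transformed integrand is smooth, whereas the Cartesian sum has to resolve the curved boundary of the disk.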

In Chapter 5 we shall return to transformations of integrals and 
assign a role to the sign of the Jacobian in connection with integrals 
over oriented manifolds. A different method of proving the transforma- 
tion formula will be given in the Appendix. 


b. Regions of More than Two Dimensions 


We can, of course, proceed in the same way with regions in space of 
three or more dimensions and obtain the following general result: 


If a closed Jordan-measurable region R of x, y, z, . . . -space is
mapped on a region $R'$ of u, v, w, . . . -space by a 1-1 transformation
whose Jacobian

$$\frac{d(x, y, z, \ldots)}{d(u, v, w, \ldots)}$$

is everywhere different from zero, then the transformation formula

$$(17)\qquad \iint\cdots\int_R f(x, y, z, \ldots)\,dx\,dy\,dz \cdots = \iint\cdots\int_{R'} f(x, y, z, \ldots) \left|\frac{d(x, y, z, \ldots)}{d(u, v, w, \ldots)}\right| du\,dv\,dw \cdots$$

holds.
As a special application, we can obtain the transformation formulas
for polar and spherical coordinates. For polar coordinates in the plane,
we write r and $\theta$ instead of u and v, and at once obtain

$$\frac{d(x, y)}{d(r, \theta)} = r$$

(cf. p. 253). For the spherical coordinates in space, defined by the
equations

$$x = r\cos\phi\sin\theta, \qquad y = r\sin\phi\sin\theta, \qquad z = r\cos\theta,$$

in which $\phi$ ranges from 0 to $2\pi$, $\theta$ from 0 to $\pi$, and r from 0 to $+\infty$, we
identify u, v, w with r, $\theta$, $\phi$; for the Jacobian we then obtain

$$\frac{d(x, y, z)}{d(r, \theta, \phi)} = \begin{vmatrix} \cos\phi\sin\theta & r\cos\phi\cos\theta & -r\sin\phi\sin\theta \\ \sin\phi\sin\theta & r\sin\phi\cos\theta & r\cos\phi\sin\theta \\ \cos\theta & -r\sin\theta & 0 \end{vmatrix} = r^2\sin\theta.$$

(The value $r^2\sin\theta$ is easily obtained by expanding in terms of the
minors of the third column.) The transformation to spherical
coordinates in space is therefore given by the formula

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \iiint_{R'} f(x, y, z)\,r^2\sin\theta\,dr\,d\theta\,d\phi.$$


As in the corresponding case in the plane, we can also arrive at the
transformation formula without using the general theory. We have
only to start with a subdivision of space given by the spheres r = constant,
the cones $\theta$ = constant, and the planes $\phi$ = constant. The
details of this elementary method can be left to the reader.

For spherical coordinates our assumptions are not satisfied when
r = 0 or $\theta$ = 0, $\pi$, since the Jacobian then vanishes. As in the case of the
plane, we can easily convince ourselves that the transformation
formula nonetheless remains valid.
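The value $r^2\sin\theta$ of the spherical Jacobian can be confirmed numerically. The sketch below is an illustration under assumptions: it approximates the partial derivatives by central differences (the step `eps` and the function names are choices made for the example) and evaluates the $3 \times 3$ determinant directly.

```python
import math


def spherical(r, theta, phi):
    """Spherical coordinates as defined above: theta is the polar angle."""
    return (r * math.cos(phi) * math.sin(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(theta))


def jacobian_det(r, theta, phi, eps=1e-6):
    """Finite-difference approximation of d(x, y, z)/d(r, theta, phi)."""
    p = [r, theta, phi]
    cols = []
    for k in range(3):
        plus = list(p)
        minus = list(p)
        plus[k] += eps
        minus[k] -= eps
        fp = spherical(*plus)
        fm = spherical(*minus)
        # Column k holds the derivatives of (x, y, z) with respect to p[k].
        cols.append([(fp[i] - fm[i]) / (2 * eps) for i in range(3)])
    a = [[cols[k][i] for k in range(3)] for i in range(3)]
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


r, theta, phi = 2.0, 0.7, 1.3
print(jacobian_det(r, theta, phi), r * r * math.sin(theta))
```

Evaluating `jacobian_det` at $\theta = 0$ or $r = 0$ gives values near zero, in line with the remark that the hypotheses of the theorem fail there.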




Exercises 4.6 
1. Perform the following integrations: 


(a) f° f, 90 — y?) dy dx 
(b) ff cos( + y) dy dx 
() N fr zdy dx 

(d) in f xetY dy dx 

(e) ff? x dy dx. 


O [of ydyde 


2. $\iint x^2 y^2\,dx\,dy$ over the circle $x^2 + y^2 \le 1$.


x? + y? — 3xy(x? + y’) . 
3. Í i) "Gap 9232 dx dy over the circle x? + y? <1. 


4. Find the volume between the x, y-plane and the paraboloid
$z = 2 - x^2 - y^2$.
5. Evaluate the integral

$$\iint \frac{dx\,dy}{(1 + x^2 + y^2)^2}$$

taken

(a) over one loop of the lemniscate $(x^2 + y^2)^2 - (x^2 - y^2) = 0$,
(b) over the triangle with vertices (0, 0), (2, 0), (1, $\sqrt{3}$).
6. Evaluate the integral 


$$\iiint |xyz|\,dx\,dy\,dz$$

taken throughout the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$.
7. Find the volume common to the two cylinders $x^2 + z^2 \le 1$ and
$y^2 + z^2 \le 1$.


8. By integration, find the volume of the smaller of the two portions into 
which a sphere of radius r is cut by a plane whose perpendicular dis- 
tance from the center is h(<r). 


9. $\iiint (x^2 + y^2 + z^2)\,xyz\,dx\,dy\,dz$ throughout the sphere $x^2 + y^2 + z^2 \le r^2$.

10. $\iiint z\,dx\,dy\,dz$ throughout the region defined by the inequalities
$x^2 + y^2 \le z^2$, $x^2 + y^2 + z^2 \le 1$.

11. $\iiint (x + y + z)\,x^2 y^2 z^2\,dx\,dy\,dz$ throughout the region $x + y + z \le 1$,
$x \ge 0$, $y \ge 0$, $z \ge 0$.


dx dy dz 
12. M 4 y? e (z — 22 throughout the sphere x? + y? + 2? <1. 


dx dy dz 
13. il + y? EE ENTIN pe throughout the sphere x? + y2? + 22 <1 


d 
14. IS ———} over the square |x| <1, |y| <1. 


15. Prove that if f(x, y) is a continuous function on a domain D in
the x, y-plane and if for every region R contained in that domain
$\iint_R f(x, y)\,dx\,dy = 0$, then f(x, y) is identically 0.

16. Prove that 

—u2 

a? + u? du 

where R denotes the half-plane x Z a > 0, by applying the trans- 

formation 


Í f e-(22+y2) dy dy = ae% f 


x? + y2? =u? +a, y=ux. 


17. Prove that

$$\iint (u_x^2 + u_y^2)\,dx\,dy$$

is invariant on inversion.
18. Evaluate the integral

$$I = \iiint \cos(x\xi + y\eta + z\zeta)\,d\xi\,d\eta\,d\zeta$$

taken throughout the sphere $\xi^2 + \eta^2 + \zeta^2 \le 1$.
19. In the integral 


(20—42z) /(8— D 
I= i dx J — 4) dy 
change the order of integration and evaluate the integral. 


4.7 Improper Multiple Integrals 


In the case of functions of one variable, we found it necessary to 
extend the concept of integral to other functions that are not con- 
tinuous in the interval of integration. In particular, we considered the 
integrals of functions with jump-discontinuities and of functions with 
infinite values; we also considered integrals over infinite intervals of 
integration. The corresponding extensions of the concept of integral 
for functions of several variables will now be discussed.

The notion of “integral”, as defined on p. 377 (we call it the Riemann
integral), is not tied to continuity of the integrand f(x, y). As
long as f is bounded in the region of integration R, we can always form
the upper and lower sums corresponding to a division of R into
Jordan-measurable sets $R_i$. We call f integrable (more precisely
Riemann-integrable) if these upper and lower sums approach the same limit as
the division of R is refined indefinitely. This is essentially the
procedure we shall follow in the exposition given in the Appendix to this
chapter.¹ Strictly speaking the integral of any integrable function is
proper, even if the function happens to be discontinuous.

In this section, however, we take only the existence of integrals of 
continuous functions for granted and try by limiting processes to ex- 
tend the notion of integral and to prove its existence for wider classes 
of functions. We leave open the question whether improper integrals 
defined in this way are really identical with proper Riemann integrals 
obtained directly from upper and lower sums of subdivisions of R.²


a. Improper Integrals of Functions over Bounded Sets 


The functions we aim to integrate are, in most cases, continuous in
a certain region R except at isolated points or along certain curves,
where the functions are not defined or are unbounded, or where their
continuity is doubtful. In all cases that interest us the set of points of
exceptional behavior for the function has area 0 (the word “area” is
used here exclusively in the sense of Jordan-measure or content).³
We may then cut away from R a set s of small area containing the
exceptional points, integrate f over the remainder, and take the limit
of the integrals of f over R - s as the area of s tends to 0. If this limit
exists, it defines the “improper” integral of f over R. Since we do not
want the limit to depend on the particular way in which we
approximate the set R, we shall confine ourselves to the simplest situation
(corresponding to “absolute convergence” in contrast to “conditional
convergence” in infinite series) where not only f but also |f| has an
improper integral.


Let the region of integration R be bounded and have an area. Assume
that we can find a “monotone” sequence of closed subregions $R_n$ (i.e.,
$R_n \subset R_{n+1} \subset R$) in each of which f(x, y) is defined and continuous.
Assume moreover that the areas $A(R_n)$ of the sets $R_n$ approach the area
A(R) and that the integrals

$$(19a)\qquad \iint_{R_n} |f(x, y)|\,dx\,dy$$

are bounded independently of n. Then

$$(19b)\qquad I = \lim_{n \to \infty} \iint_{R_n} f(x, y)\,dx\,dy$$

exists. This limit will be shown to be independent of the particular
approximating sequence $R_n$, and will be used to define the improper
integral

$$(19c)\qquad I = \iint_R f(x, y)\,dx\,dy.$$

¹We there use only subdivisions into squares in defining the integral. But this
restriction can be shown to be inessential.

²This actually always is the case when f is bounded and is continuous except possibly
on a set of points of content 0, provided R is bounded and Jordan-measurable.

³More refined notions, like the Lebesgue integral, are needed to integrate some
functions whose points of discontinuity form a set of positive Jordan measure.


Before proving this theorem, we illustrate the ideas by some typical 
examples. 
The function

$$f(x, y) = \log\sqrt{x^2 + y^2}$$

becomes infinite at the origin of the x, y-plane. Therefore, in order to
calculate the integral of f over a region R containing the origin, for
example, over the circle $x^2 + y^2 \le 1$, we must cut out the origin by
surrounding it with a region s whose area tends to 0. We must then
investigate the convergence of the integral taken over the residual
region R - s. We take for s the circular disk $s_n$ of radius 1/n. Let $R_n$
be the region obtained from R by cutting out $s_n$. Let, in turn, R be
contained in a circle of radius $\rho$ about the origin. Transforming to
polar coordinates, we have

$$\iint_{R_n} |f|\,dx\,dy = \iint_{R_n} |f|\,r\,dr\,d\theta \le \int_{1/n}^{\rho} dr \int_0^{2\pi} d\theta\; r|\log r| = 2\pi \int_{1/n}^{\rho} r|\log r|\,dr.$$

The transformation thus yields a new integrand $r|\log r|$ that is bounded
and even continuous if defined as 0 for r = 0. Hence, uniformly
for all n,

$$\iint_{R_n} |f|\,dx\,dy \le 2\pi \int_0^{\rho} r|\log r|\,dr.$$

The existence of the improper integral

$$\iint_R \log\sqrt{x^2 + y^2}\,dx\,dy = \lim_{n \to \infty} \iint_{R_n} \log\sqrt{x^2 + y^2}\,dx\,dy$$

follows. For example, if R is the unit disk we find

$$(20a)\qquad \iint_{x^2 + y^2 \le 1} \log\sqrt{x^2 + y^2}\,dx\,dy = \int_0^1 dr \int_0^{2\pi} d\theta\; r\log r = 2\pi \int_0^1 r\log r\,dr = 2\pi \left(\tfrac{1}{2} r^2 \log r - \tfrac{1}{4} r^2\right)\Big|_0^1 = -\frac{\pi}{2}.$$
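The limiting process behind (20a) can be watched numerically. The sketch below is only an illustration: it assumes the unit disk for R, uses the polar form $2\pi\int_{1/n}^{1} r\log r\,dr$ of the integral over $R_n$, and approximates it with a midpoint rule whose step count is an arbitrary choice.

```python
import math


def integral_over_Rn(n, steps=20000):
    """Midpoint approximation of 2*pi * integral from 1/n to 1 of r*log(r) dr."""
    a, b = 1.0 / n, 1.0
    h = (b - a) / steps
    s = 0.0
    for i in range(steps):
        r = a + (i + 0.5) * h
        s += r * math.log(r)
    return 2.0 * math.pi * s * h


for n in (2, 10, 100, 1000):
    print(n, integral_over_Rn(n))
print("limit -pi/2 =", -math.pi / 2)
```

The printed values decrease toward $-\pi/2$ as the excluded disk shrinks, as the convergence theorem predicts.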


As a further example, we consider the integral

$$(20b)\qquad \iint \frac{dx\,dy}{\sqrt{x^2 + y^2}^{\,\alpha}}$$

taken over the same region. Here we obtain immediately

$$\iint_{R_n} |f|\,dx\,dy \le \int_{1/n}^{\rho} dr \int_0^{2\pi} d\theta\; |f|\,r = 2\pi \int_{1/n}^{\rho} r^{1-\alpha}\,dr.$$

From Volume I (p. 305) we know that the integral $\int_0^{\rho} r^{1-\alpha}\,dr$ is
convergent if and only if $\alpha < 2$. We therefore conclude that the double
integral (20b) likewise is convergent if and only if $\alpha < 2$. This remark
can readily be extended into a sufficient (but by no means necessary)
criterion for the convergence of improper double integrals,
which is applicable in many special cases.


If the function f(x, y) is continuous in the region R everywhere
except at one point, which we take as the origin, and if there exists a
fixed bound M and a positive number $\alpha < 2$ such that

$$(21a)\qquad |f(x, y)| \le \frac{M}{\sqrt{x^2 + y^2}^{\,\alpha}}$$

everywhere in R for $(x, y) \ne (0, 0)$, then the integral

$$(21b)\qquad \iint_R f(x, y)\,dx\,dy$$

converges.
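The role of the threshold $\alpha = 2$ can be made concrete with the radial integral that bounds (21b). The following sketch assumes $M = 1$ and the unit disk; the function name `radial_bound` is hypothetical. It evaluates $2\pi\int_{\varepsilon}^{1} r^{1-\alpha}\,dr$ in closed form for shrinking $\varepsilon$.

```python
import math


def radial_bound(alpha, eps):
    """2*pi * integral from eps to 1 of r**(1 - alpha) dr, in closed form."""
    if alpha == 2.0:
        return 2.0 * math.pi * math.log(1.0 / eps)
    p = 2.0 - alpha
    return 2.0 * math.pi * (1.0 - eps ** p) / p


for alpha in (1.0, 1.5, 2.0, 2.5):
    values = [radial_bound(alpha, eps) for eps in (1e-2, 1e-4, 1e-6)]
    print(alpha, values)
```

For $\alpha < 2$ the values stay below $2\pi/(2 - \alpha)$ however small $\varepsilon$ becomes, while for $\alpha \ge 2$ they grow without bound.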
We can treat the triple integral

$$\iiint_R \frac{dx\,dy\,dz}{\sqrt{x^2 + y^2 + z^2}^{\,\alpha}}$$

in a similar way. If R contains the origin, we introduce spherical
coordinates and obtain

$$\iiint r^{2-\alpha} \sin\theta\,dr\,d\phi\,d\theta.$$

A discussion similar to the preceding one shows us that convergence
occurs when $\alpha < 3$. Again, more generally, we see that

$$(22a)\qquad \iiint_R f(x, y, z)\,dx\,dy\,dz$$

converges if f(x, y, z) is continuous in R except at the origin provided
that there exists a bound M and a constant $\alpha < 3$ for which

$$(22b)\qquad |f(x, y, z)| \le \frac{M}{\sqrt{x^2 + y^2 + z^2}^{\,\alpha}}.$$

In consequence, for an everywhere continuous function g(x, y, z), the
improper integral

$$(22c)\qquad \iiint_R \frac{g(x, y, z)}{\sqrt{x^2 + y^2 + z^2}^{\,\alpha}}\,dx\,dy\,dz$$

exists, if $\alpha < 3$. Improper integrals can also exist for integrands that
are infinite along whole curves, not only at single points. In the
simplest case, the integrand is infinite on a portion of a straight line,
say a segment of the y-axis. In this case, if the relation

$$(23)\qquad |f(x, y)| \le \frac{M}{|x|^{\alpha}}$$

is valid everywhere in R for $x \ne 0$, where M is a fixed bound and
$\alpha < 1$, then again the improper integral of f over R exists. For the
proof, we only have to cut out from R a strip about the y-axis and let
the width of the strip tend to 0.


Integrals like

$$\iint_R \frac{dx\,dy}{x},$$

violating our restriction on the exponent $\alpha$, may sometimes still be
defined in a “conditional” sense, in which the value depends on the
precise manner of approximation to R. Here, for example, the integral
can be defined as the limit of integrals over the regions obtained by
cutting out of R a strip symmetric to the y-axis. Other approximations
may lead to different values for the integral or even to divergence.


b. Proof of the General Convergence Theorem for Improper 
Integrals 


We consider the set R of area A(R) and a sequence of closed subsets
$R_n$ whose areas $A(R_n)$ tend to A(R) for $n \to \infty$. Here the $R_n$ shall
expand monotonically inside R:

$$(24a)\qquad R_1 \subset R_2 \subset R_3 \subset \cdots \subset R.$$

The function f(x, y) is assumed to be continuous in each $R_n$. Moreover,
there shall exist a constant $\mu$ such that

$$(24b)\qquad \iint_{R_n} |f(x, y)|\,dx\,dy \le \mu$$

for all n.
Because of (24a) the integrals

$$\iint_{R_n} |f|\,dx\,dy$$

obviously form a monotone increasing bounded sequence and thus
have a limit for $n \to \infty$. By the Cauchy convergence test, for every
$\varepsilon > 0$ we can find an $N = N(\varepsilon)$ such that, for $m > n > N(\varepsilon)$,

$$(24c)\qquad \iint_{R_m} |f|\,dx\,dy - \iint_{R_n} |f|\,dx\,dy = \iint_{R_m - R_n} |f|\,dx\,dy < \varepsilon.$$

Let

$$I_n = \iint_{R_n} f(x, y)\,dx\,dy.$$

Clearly the $I_n$ also satisfy the Cauchy test, since, by (24c),

$$\left|\iint_{R_m} f\,dx\,dy - \iint_{R_n} f\,dx\,dy\right| = \left|\iint_{R_m - R_n} f\,dx\,dy\right| \le \iint_{R_m - R_n} |f|\,dx\,dy < \varepsilon$$

for $m > n > N(\varepsilon)$. It follows that

$$I = \lim_{n \to \infty} \iint_{R_n} f(x, y)\,dx\,dy$$

exists.

It remains to be shown that the value I does not depend on the
particular approximating sequence $R_n$ used. Let S be any closed
Jordan-measurable subset of R in which f is continuous. Let M be
an upper bound for |f| in S. Then, by the mean value theorem of
integral calculus (see p. 384),¹

$$\left|\iint_S f\,dx\,dy - \iint_{S \cap R_n} f\,dx\,dy\right| = \left|\iint_{S - R_n} f\,dx\,dy\right| \le \iint_{S - R_n} |f|\,dx\,dy \le M A(S - R_n) \le M A(R - R_n) = M[A(R) - A(R_n)].$$

It follows from our assumption $\lim A(R_n) = A(R)$ that

$$(24d)\qquad \iint_S f\,dx\,dy = \lim_{n \to \infty} \iint_{S \cap R_n} f\,dx\,dy.$$

Applying this relation to |f| instead of f, and using (24b), we find

$$(24e)\qquad \iint_S |f|\,dx\,dy = \lim_{n \to \infty} \iint_{S \cap R_n} |f|\,dx\,dy \le \lim_{n \to \infty} \iint_{R_n} |f|\,dx\,dy \le \mu.$$

Thus, the estimate (24b) has been extended to more general subsets
S of R.
We can also extend (24c). We have, using (24d),

$$(24f)\qquad \left|\iint_S f\,dx\,dy - \iint_{S \cap R_n} f\,dx\,dy\right| = \lim_{m \to \infty} \left|\iint_{S \cap R_m} f\,dx\,dy - \iint_{S \cap R_n} f\,dx\,dy\right|$$

$$= \lim_{m \to \infty} \left|\iint_{S \cap (R_m - R_n)} f\,dx\,dy\right| \le \lim_{m \to \infty} \iint_{R_m - R_n} |f|\,dx\,dy = \lim_{m \to \infty} \left(\iint_{R_m} |f|\,dx\,dy - \iint_{R_n} |f|\,dx\,dy\right) \le \varepsilon$$

for $n > N(\varepsilon)$. Here N does not depend on the particular set S.

¹We remind the reader that $S \cap R_n$ stands for the set of points common to S and $R_n$
and $S - R_n$ for the set of points that belong to S but not to $R_n$ (see p. 116):
$S - R_n = S - S \cap R_n$. We write again $A(S - R_n)$ for the area of the set $S - R_n$.


Let now $S_1, S_2, \ldots$ be a sequence of closed subsets of R in which
f is continuous and for which

$$(24g)\qquad S_1 \subset S_2 \subset S_3 \subset \cdots \subset R$$

and

$$(24h)\qquad \lim_{m \to \infty} A(S_m) = A(R).$$

Since by (24e)

$$\iint_{S_m} |f|\,dx\,dy \le \mu,$$

we know that

$$J = \lim_{m \to \infty} \iint_{S_m} f\,dx\,dy$$

exists. Then

$$\left|J - \iint_{S_m} f\,dx\,dy\right| < \varepsilon$$

for all sufficiently large m. It follows from (24f) that

$$\left|J - \iint_{S_m \cap R_n} f\,dx\,dy\right| < 2\varepsilon$$

for all m, n that are both sufficiently large. Interchanging the roles
of the $S_m$ and $R_n$, we also have

$$\left|I - \iint_{S_m \cap R_n} f\,dx\,dy\right| < 2\varepsilon$$

for all sufficiently large m, n. Hence, $|J - I| < 4\varepsilon$ for any positive
number $\varepsilon$, and thus, I = J, which was to be proved.


c. Integrals over Unbounded Regions


A different type of improper integral arises when the integrand f
is continuous but the region of integration extends to infinity. Again,
we do not try to analyze the most general situation but formulate a
convergence criterion applicable to most cases occurring in practice.
It is sufficient to treat the case of two independent variables.

We consider an unbounded set R in which the function f is
continuous. We exhaust R by a monotone sequence of subsets

$$R_1 \subset R_2 \subset R_3 \subset \cdots \subset R$$

each of which is closed, bounded, and Jordan-measurable. Instead of
the previous condition $\lim_{n \to \infty} A(R_n) = A(R)$, which might make no sense
for unbounded R, we require that every closed and bounded subset of
R is contained in at least one of the sets $R_n$. (If, for example, R is the
whole plane, we can choose for the $R_n$ the circular disks of radius n
with center at the origin.) If the limit

$$\lim_{n \to \infty} \iint_{R_n} f(x, y)\,dx\,dy$$

exists and is independent of the particular choice of the sequence of
subsets $R_n$, we call it the integral of f over R and denote it by

$$\iint_R f\,dx\,dy.$$

We then have the following sufficient condition for existence of the
integral:

The improper integral of f over the unbounded set R exists if for one
particular sequence $R_n$ (of the type described) the integrals of |f| over
$R_n$ are bounded uniformly in n, say if

$$\iint_{R_n} |f|\,dx\,dy \le \mu$$

for all n.

The proof of this convergence criterion uses the same arguments 
as the one for improper integrals over bounded sets, and should be 
carried out as an exercise by the reader. 

We illustrate the theorem with the integral

$$\iint e^{-x^2 - y^2}\,dx\,dy,$$

where the region of integration is the whole x, y-plane. We choose for
the sequence $R_n$ of subregions the circular disks of radius n with
center at the origin, which obviously satisfy all our requirements. Here,
transforming to polar coordinates:

$$\iint_{R_n} \left|e^{-x^2 - y^2}\right| dx\,dy = \iint_{R_n} e^{-x^2 - y^2}\,dx\,dy = \int_0^n dr \int_0^{2\pi} d\theta\; r e^{-r^2} = 2\pi \int_0^n r e^{-r^2}\,dr = -\pi e^{-r^2}\Big|_0^n = \pi - \pi e^{-n^2}.$$

This proves the boundedness of the integrals over $R_n$ and, hence, the
existence of the integral over R. For $n \to \infty$ we find for the value of
our improper integral

$$\iint e^{-x^2 - y^2}\,dx\,dy = \lim_{n \to \infty} \pi\left(1 - e^{-n^2}\right) = \pi.$$

On the other hand, we must obtain the same limit by using instead of
the $R_n$ the sequence $S_m$ of squares

$$-m \le x \le +m, \qquad -m \le y \le +m.$$

Here we can make use of the fact that the integrand is a product of a
function of x and of a function of y (see p. 380) and find

$$\iint_{S_m} e^{-x^2 - y^2}\,dx\,dy = \iint_{S_m} e^{-x^2} \cdot e^{-y^2}\,dx\,dy = \left(\int_{-m}^{m} e^{-x^2}\,dx\right)\left(\int_{-m}^{m} e^{-y^2}\,dy\right) = \left(\int_{-m}^{m} e^{-x^2}\,dx\right)^2.$$

It follows that

$$\lim_{m \to \infty} \iint_{S_m} e^{-x^2 - y^2}\,dx\,dy = \left(\int_{-\infty}^{\infty} e^{-x^2}\,dx\right)^2.$$

Since the $R_n$ and $S_m$ must yield the same value for the integral over
R, we find that

$$(25a)\qquad \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$

By using the theory of improper double integrals we have thus
evaluated an improper single integral that is of great importance in
analysis. This value is difficult to find directly since the indefinite integral
of $e^{-x^2}$ cannot be expressed in terms of elementary functions.
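The agreement between the two exhausting sequences can be checked numerically. This is a minimal sketch under stated assumptions: the function names are hypothetical, the integral over the disk uses the closed form $\pi(1 - e^{-n^2})$ derived above, and the integral over the square is computed as the square of a midpoint-rule approximation of the single integral.

```python
import math


def gauss_1d(m, steps=20000):
    """Midpoint approximation of the integral of exp(-x^2) over [-m, m]."""
    h = 2.0 * m / steps
    total = 0.0
    for i in range(steps):
        x = -m + (i + 0.5) * h
        total += math.exp(-x * x)
    return total * h


def over_square(m):
    """Integral of exp(-x^2 - y^2) over the square |x| <= m, |y| <= m."""
    return gauss_1d(m) ** 2


def over_disk(n):
    """Integral of exp(-x^2 - y^2) over the disk of radius n (closed form)."""
    return math.pi * (1.0 - math.exp(-n * n))


for size in (1, 2, 4, 6):
    print(size, over_disk(size), over_square(size))
print("pi =", math.pi, " sqrt(pi) =", math.sqrt(math.pi))
```

Both columns approach $\pi$, with the square slightly ahead of the disk of the same size because it contains that disk.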

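We can make use of this result to evaluate the gamma function (see
Volume I, p. 308)

$$(25b)\qquad \Gamma(n) = \int_0^{\infty} e^{-t} t^{n-1}\,dt$$

for the argument $n = \tfrac{1}{2}$. The substitution $t = x^2$ yields

$$(25c)\qquad \Gamma\!\left(\tfrac{1}{2}\right) = \int_0^{\infty} e^{-t} t^{-1/2}\,dt = 2\int_0^{\infty} e^{-x^2}\,dx = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$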
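As a quick check of (25c), the sketch below compares a truncated midpoint approximation of $2\int_0^{\infty} e^{-x^2}\,dx$ with the library value of $\Gamma(1/2)$ from Python's standard `math.gamma`. The truncation point `upper` and the step count are assumptions chosen for the illustration.

```python
import math


def gamma_half_via_substitution(upper=8.0, steps=20000):
    """2 * integral from 0 to `upper` of exp(-x^2) dx, the middle term of (25c)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.exp(-x * x)
    return 2.0 * h * total


print(gamma_half_via_substitution(), math.gamma(0.5), math.sqrt(math.pi))
```

Working with the substituted form avoids the singular factor $t^{-1/2}$ at $t = 0$ that the original integral (25b) has for $n = \tfrac{1}{2}$.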


We can formulate useful convergence tests for improper integrals
over unbounded regions by comparison with powers of $\sqrt{x^2 + y^2}$. These
are analogous to the test found on p. 409 for functions that are
unbounded near the origin. We find that the improper integral of a
continuous function f(x, y) over an unbounded region R exists if f
everywhere in R satisfies an inequality

$$(26)\qquad |f(x, y)| \le \frac{M}{\sqrt{x^2 + y^2}^{\,\alpha}}$$

where M and $\alpha$ are fixed constants and $\alpha > 2$.¹


¹Behavior at infinity and at the origin are “complementary” in the sense that f is
integrable near the origin if (26) holds for a value $\alpha < 2$. Thus, the improper integral

$$\iint \frac{dx\,dy}{\sqrt{x^2 + y^2}^{\,\alpha}}$$

extended over the whole plane exists for no value of $\alpha$.


Exercises 4.7


1. (a) By transforming to polar coordinates, show that the value of the
integral

$$K = \int_0^{a\sin\beta} \left\{ \int_{y\cot\beta}^{\sqrt{a^2 - y^2}} \log(x^2 + y^2)\,dx \right\} dy \qquad \left(0 < \beta < \frac{\pi}{2}\right)$$

is $a^2\beta\left(\log a - \tfrac{1}{2}\right)$.

(b) Change the order of integration in the original integral.
2. Integrate

(a) $\iint \dfrac{1}{(x^2 + y^2 + 1)^2}\,dx\,dy$ over the x, y-plane,

(b) $\iiint \dfrac{1}{(x^2 + y^2 + z^2 + 1)^2}\,dx\,dy\,dz$ over x, y, z-space.


3. Show that the order of integration in 


t= Eig CTY ds} dy 


cannot be reversed. 


4.8 Geometrical Applications 


a. Elementary Calculation of Volumes 


The concept of volume forms the starting-point of our definition of 
“integral.” Here we use multiple integrals in order to calculate the 
volumes of several solids. 

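For example, in order to calculate the volume of the ellipsoid of
revolution

$$\frac{x^2 + y^2}{a^2} + \frac{z^2}{b^2} = 1,$$

we write the equation in the form

$$z = \pm\frac{b}{a}\sqrt{a^2 - x^2 - y^2}.$$

The volume of the half of the ellipsoid above the x, y-plane is therefore
given by the double integral [see (3b)],

$$\frac{V}{2} = \frac{b}{a} \iint \sqrt{a^2 - x^2 - y^2}\,dx\,dy$$

taken over the circle $x^2 + y^2 \le a^2$. If we transform to polar
coordinates, the double integral becomes

$$\iint r\sqrt{a^2 - r^2}\,dr\,d\theta,$$

whence, on resolution into single integrals

$$\frac{V}{2} = \frac{b}{a} \int_0^{2\pi} d\theta \int_0^a r\sqrt{a^2 - r^2}\,dr = 2\pi\frac{b}{a} \int_0^a r\sqrt{a^2 - r^2}\,dr,$$

which gives the required value,

$$V = \frac{4}{3}\pi a^2 b.$$

To calculate the volume of the general ellipsoid

$$(27a)\qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$

we make the transformation

$$x = a\rho\cos\theta, \qquad y = b\rho\sin\theta, \qquad \frac{d(x, y)}{d(\rho, \theta)} = ab\rho$$

and for half the volume obtain

$$\frac{V}{2} = c \iint_R \sqrt{1 - \frac{x^2}{a^2} - \frac{y^2}{b^2}}\,dx\,dy = abc \iint_{R'} \rho\sqrt{1 - \rho^2}\,d\rho\,d\theta.$$

Here the region $R'$ is the rectangle $0 \le \rho \le 1$, $0 \le \theta \le 2\pi$. Thus,

$$\frac{V}{2} = abc \int_0^{2\pi} d\theta \int_0^1 \rho\sqrt{1 - \rho^2}\,d\rho = \frac{2}{3}\pi abc$$

or

$$(27b)\qquad V = \frac{4}{3}\pi abc.$$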
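Formula (27b) can be confirmed by evaluating the transformed integral numerically. The sketch below is an illustration under assumptions: the semi-axes passed in are arbitrary sample values, the function name is hypothetical, and the radial integral $\int_0^1 \rho\sqrt{1 - \rho^2}\,d\rho$ is approximated by a midpoint rule (the $\theta$-integration contributes the factor $2\pi$ because the integrand does not depend on $\theta$).

```python
import math


def half_ellipsoid_volume(a, b, c, steps=100000):
    """abc * integral over [0, 2*pi] x [0, 1] of rho*sqrt(1 - rho^2) d(rho) d(theta)."""
    h = 1.0 / steps
    radial = 0.0
    for i in range(steps):
        rho = (i + 0.5) * h
        radial += rho * math.sqrt(1.0 - rho * rho)
    return a * b * c * 2.0 * math.pi * radial * h


a, b, c = 1.0, 2.0, 3.0
print(2.0 * half_ellipsoid_volume(a, b, c), 4.0 / 3.0 * math.pi * a * b * c)
```

Setting two of the semi-axes equal reproduces the value $\tfrac{4}{3}\pi a^2 b$ found above for the ellipsoid of revolution.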
Finally, we shall calculate the volume of the pyramid enclosed by
the three coordinate planes and the plane $ax + by + cz - 1 = 0$,
where we assume that a, b, and c are positive. For the volume we
obtain

$$V = \frac{1}{c} \iint (1 - ax - by)\,dx\,dy,$$

where the region of integration is the triangle $0 \le x \le 1/a$,
$0 \le y \le (1 - ax)/b$ in the x, y-plane. Therefore,

$$V = \frac{1}{c} \int_0^{1/a} dx \int_0^{(1 - ax)/b} (1 - ax - by)\,dy.$$

Integration with respect to y gives

$$\left[(1 - ax)y - \frac{b}{2}y^2\right]_0^{(1 - ax)/b} = \frac{(1 - ax)^2}{2b},$$

and if we integrate again by means of the substitution $1 - ax = t$,
we obtain

$$V = \frac{1}{2bc} \int_0^{1/a} (1 - ax)^2\,dx = -\frac{1}{6abc}(1 - ax)^3\Big|_0^{1/a} = \frac{1}{6abc}.$$

This result agrees, of course, with the rule of elementary geometry
that the volume of a pyramid is one-third of the product of base and
altitude.
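The repeated integral for the pyramid can be evaluated numerically in the same order as in the text. This is a sketch only; the function name, the sample coefficients, and the number of subdivisions are assumptions made for the example.

```python
def pyramid_volume(a, b, c, n=400):
    """(1/c) * integral over 0<=x<=1/a, 0<=y<=(1-ax)/b of (1 - ax - by) dy dx."""
    hx = (1.0 / a) / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        y_max = (1.0 - a * x) / b          # upper limit of the inner integral
        hy = y_max / n
        inner = 0.0
        for j in range(n):
            y = (j + 0.5) * hy
            inner += 1.0 - a * x - b * y
        total += inner * hy * hx
    return total / c


a, b, c = 2.0, 3.0, 4.0
print(pyramid_volume(a, b, c), 1.0 / (6.0 * a * b * c))
```

Note that the inner step `hy` is recomputed for each `x`, since the upper limit of the inner integral depends on `x`; this is the numerical counterpart of resolving the double integral over a nonrectangular region.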

In order to calculate the volume of a more complicated solid we 
can subdivide the solid into pieces whose volumes can be expressed 
directly by double integrals. Later, however (in particular in the next 
chapter), we shall obtain expressions for the volume bounded by a 
closed surface that do not involve this subdivision. 


b. General Remarks on the Calculation of Volumes. Solids of 
Revolution. Volumes in Spherical Coordinates 


Just as we can express the area of a plane region R by the double
integral

$$\iint_R dR = \iint_R dx\,dy,$$

we may also express the volume of a three-dimensional region R by the
integral

$$V = \iiint_R dx\,dy\,dz$$

over the region R. In fact this point of view exactly corresponds to our
definition of integral (cf. Appendix, p. 517) and expresses the
geometrical fact that we can find the volume of a region by cutting space
into identical cubes, finding the total volume of the cubes contained
entirely in R, and then letting the diameter of the cubes tend to zero.
The resolution of this integral for V into an integral $\int dz \iint dx\,dy$
[see (14a), p. 397] expresses Cavalieri’s principle, known to us from
elementary geometry, according to which the volume of a solid is
determined if we know the area of every plane cross section that is
perpendicular to a definite line, say the z-axis. The general expression given
above for the volume of a three-dimensional region enables us at once
to find various formulae for calculating volumes. For this purpose,
it often is useful to introduce new independent variables into the
integral instead of x, y, z.

The most important examples are given by spherical coordinates
and by cylindrical coordinates. Let us calculate, for example, the
volume of a solid of revolution obtained by rotating a curve $x = \phi(z)$
about the z-axis. We assume that the curve does not cross the z-axis
and that the solid of revolution is bounded above and below by planes
z = constant. The solid is therefore defined by inequalities of the
form $a \le z \le b$ and $0 \le \sqrt{x^2 + y^2} \le \phi(z)$. Its volume is given by the
integral above. In terms of the cylindrical coordinates

$$z, \qquad \rho = \sqrt{x^2 + y^2}, \qquad \theta = \arccos\frac{x}{\rho} = \arcsin\frac{y}{\rho},$$

the expression for the volume becomes

$$V = \iiint_R dx\,dy\,dz = \int_a^b dz \int_0^{2\pi} d\theta \int_0^{\phi(z)} \rho\,d\rho.$$

If we perform the single integrations, we at once obtain

$$V = \pi \int_a^b \phi(z)^2\,dz. \tag{28a}$$


We can also give a more intuitive derivation of this formula (see 
Volume I, p. 374). We cut the solid of revolution into small slices 


$$z_\nu \le z \le z_{\nu+1}$$

by planes perpendicular to the z-axis, and we denote by $m_\nu$ the minimum
and by $M_\nu$ the maximum of the distance $\phi(z)$ from the axis in this
slice. The volume of the slice lies then between the volumes of two
cylinders with altitude

$$\Delta z = z_{\nu+1} - z_\nu$$

and radii $m_\nu$ and $M_\nu$, respectively. Hence,

$$\sum_\nu \pi m_\nu^2\,\Delta z \le V \le \sum_\nu \pi M_\nu^2\,\Delta z.$$

By the definition of the ordinary integral, therefore,

$$V = \pi \int_a^b \phi(z)^2\,dz.$$
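The slicing argument above can be tried numerically. The following is only a sketch under an added assumption that is not in the text: the profile function (here called `phi`) is monotone on every slice, so that its minimum and maximum on a slice are attained at the slice endpoints. The function name `slice_bounds` and the cone used as a test case are illustrative choices.

```python
import math

def slice_bounds(phi, a, b, n):
    """Lower and upper cylinder sums for the solid of revolution a <= z <= b.

    Assumption (not from the text): phi is monotone on every slice, so the
    minimum m and maximum M of phi on a slice occur at the slice endpoints.
    """
    dz = (b - a) / n
    lower = upper = 0.0
    for k in range(n):
        r0 = phi(a + k * dz)
        r1 = phi(a + (k + 1) * dz)
        m, M = min(r0, r1), max(r0, r1)
        lower += math.pi * m * m * dz   # inner cylinder of radius m
        upper += math.pi * M * M * dz   # outer cylinder of radius M
    return lower, upper

# Cone phi(z) = z on 0 <= z <= 1; formula (28a) gives V = pi/3.
lo, hi = slice_bounds(lambda z: z, 0.0, 1.0, 1000)
print(lo, math.pi / 3, hi)
```

As the number of slices grows, the two sums close in on the value given by (28a) from below and from above.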


If the region R contains the origin O of a spherical coordinate
system $(r, \theta, \phi)$ and if the surface is given by an equation

$$r = f(\theta, \phi),$$

where the function $f(\theta, \phi)$ is single-valued, it is frequently advantageous
to use these spherical coordinates instead of (x, y, z) in calculating
the volume. If we substitute the value of the Jacobian

$$\frac{d(x, y, z)}{d(r, \theta, \phi)} = r^2 \sin\theta$$

(as calculated on p. 000) in the transformation formula, we at once
obtain the expression

$$V = \iiint r^2 \sin\theta\,dr\,d\theta\,d\phi = \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\,d\theta \int_0^{f(\theta,\phi)} r^2\,dr$$

for the volume. Integration with respect to r gives

$$V = \frac{1}{3} \int_0^{2\pi} d\phi \int_0^{\pi} f^3(\theta, \phi) \sin\theta\,d\theta. \tag{28b}$$

In the special case of the sphere, for which $f(\theta, \phi) = R$ is constant, this
at once yields the volume $(4/3)\pi R^3$.
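For the reader who wants the sphere case written out, the single integrations can be displayed in one line. This is only a worked check of formula (28b) with the constant radius function; nothing beyond that formula is assumed.

```latex
V = \frac{1}{3}\int_0^{2\pi} d\phi \int_0^{\pi} R^3 \sin\theta \, d\theta
  = \frac{R^3}{3}\cdot 2\pi \cdot \Bigl[-\cos\theta\Bigr]_0^{\pi}
  = \frac{R^3}{3}\cdot 2\pi \cdot 2
  = \frac{4}{3}\pi R^3 .
```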


c. Area of a Curved Surface 


We expressed the length of a curve by an ordinary integral (Volume 
I, p. 349). We now wish to find an analogous expression for the area 
of a curved surface by means of a double integral. We defined the 
length of a curve as the limiting value of the length of an inscribed 
polygon when the lengths of the individual sides tend to zero. This 
suggests that we define the area of a surface analogously as follows: 
In the curved surface we inscribe a polyhedron formed of plane 
triangles, determine the area of the polyhedron, make the inscribed 
net of triangles finer by letting the length of the longest side tend to 
zero, and seek to find the limiting value of the area of the polyhedron. 
This limiting value would then be called the area of the curved
surface. It turns out, however, that such a definition of area would 
have no precise meaning, for in general this process does not yield a 
definite limiting value. This phenomenon may be explained in the 
following way: a polygon inscribed in a smooth curve always has the 
property (expressed by the mean value theorem of the differential 
calculus) that the direction of the individual side of the polygon ap- 
proaches the direction of the curve as closely as we please if the sub- 
division is fine enough. With curved surfaces the situation is quite 
different. The sides of a polyhedron inscribed in a curved surface may 
be inclined to the tangent plane to the surface at a neighboring point 
as steeply as we please, even if the polyhedral faces have arbitrarily 
small diameters. The area of such a polyhedron, therefore, cannot by 
any means be regarded as an approximation to the area of the curved 
surface. In the Appendix we shall consider an example of this state of 
affairs in detail (pp. 540). 

In the definition of the length of a smooth curve, however, we can, 
instead of using an inscribed polygon, equally well use a circumscribed 
one, that is, a polygon of which every side touches the curve. The 
definition of the length of a curve as the limit of the length of a 
circumscribed polygon can easily be extended to curved surfaces, if 
first modified as follows: we obtain the length of a curve y = f(x) that 
has a continuous derivative f'(x) and lies between the abscissae a and 
b by subdividing the interval between a and b at the points $x_0, x_1, \ldots, x_n$
into n equal or different parts, choosing an arbitrary point $\xi_\nu$ in
the νth subinterval, constructing the tangent to the curve at this
point, and measuring the length $l_\nu$ of the portion of this tangent lying
in the strip $x_\nu \le x \le x_{\nu+1}$ (Fig. 4.14). If we let n increase beyond all
bounds and at the same time let the length of the longest subinterval
tend to 0, the sum

$$\sum_{\nu=0}^{n-1} l_\nu$$

then tends to the length of the curve, that is, to the integral

$$\int_a^b \sqrt{1 + f'(x)^2}\,dx.$$

This statement follows from the fact that

$$l_\nu = (x_{\nu+1} - x_\nu)\sqrt{1 + f'(\xi_\nu)^2}.$$

Figure 4.14


We now define the area of a curved surface similarly. We begin by
considering a surface represented by a function z = f(x, y) with
continuous derivatives on a region R of the x, y-plane. We subdivide
R into n subregions $R_1, R_2, \ldots, R_n$ with the areas $\Delta R_1, \ldots, \Delta R_n$,
and in these subregions we choose points $(\xi_1, \eta_1), \ldots, (\xi_n, \eta_n)$. At the
point of the surface with the coordinates $\xi_\nu$, $\eta_\nu$ and $\zeta_\nu = f(\xi_\nu, \eta_\nu)$ we
construct the tangent plane and find the area of the portion of this
plane lying above the region $R_\nu$ (Fig. 4.15). If $\alpha_\nu$ is the angle that the
tangent plane

$$z - \zeta_\nu = f_x(\xi_\nu, \eta_\nu)(x - \xi_\nu) + f_y(\xi_\nu, \eta_\nu)(y - \eta_\nu)$$

makes with the x, y-plane and if $\Delta\tau_\nu$ is the area of the portion $\tau_\nu$ of the
tangent plane above $R_\nu$, then the region $R_\nu$ is the projection of $\tau_\nu$ on
the x, y-plane,¹ so that

$$\Delta R_\nu = \Delta\tau_\nu \cos\alpha_\nu.$$

Again (cf. Chapter 3, p. 239),

$$\cos\alpha_\nu = \frac{1}{\sqrt{1 + f_x^2(\xi_\nu, \eta_\nu) + f_y^2(\xi_\nu, \eta_\nu)}},$$

and therefore,

$$\Delta\tau_\nu = \sqrt{1 + f_x^2(\xi_\nu, \eta_\nu) + f_y^2(\xi_\nu, \eta_\nu)}\;\Delta R_\nu.$$

Figure 4.15

We form the sum of all these areas

$$\sum_{\nu=1}^{n} \Delta\tau_\nu$$

and let n increase beyond all bounds, at the same time letting the
diameter of the largest subdivision tend to zero. According to our
definition of “integral” this sum will have the limit

$$A = \iint_R \sqrt{1 + f_x^2 + f_y^2}\,dR. \tag{29a}$$

This integral, which is independent of the mode of subdivision of the
region R, we now use to define the area of the given surface. If the
surface happens to be a plane surface, this definition agrees with the
preceding; for example, if z = f(x, y) = 0, we have

$$A = \iint_R dR.$$


It is occasionally convenient to call the symbol

$$d\sigma = \sqrt{1 + f_x^2 + f_y^2}\,dR = \sqrt{1 + f_x^2 + f_y^2}\,dx\,dy$$

the element of area of the surface z = f(x, y). The area integral can then
be written symbolically in the form

$$\iint_R d\sigma.$$

¹The fact that the area of a plane set is multiplied on projection onto another plane
by the cosine of the included angle α is a consequence of our general substitution
formula for integrals. We can introduce Cartesian coordinate systems x, y and X, Y
in the two planes such that the y- and Y-axes coincide. The projection of a point
(X, Y) onto the x, y-plane then has coordinates x = X cos α, y = Y. Hence, the
projected area is

$$\iint dx\,dy = \iint \left|\frac{d(x, y)}{d(X, Y)}\right| dX\,dY = \iint dX\,dY \cos\alpha.$$


We arrive at another form of the expression for the area if we think
of the surface as given by an equation $\phi(x, y, z) = 0$ instead of
z = f(x, y). If we assume that $\phi_z \ne 0$ on the surface, the equations

$$\frac{\partial z}{\partial x} = -\frac{\phi_x}{\phi_z}, \qquad \frac{\partial z}{\partial y} = -\frac{\phi_y}{\phi_z}$$

at once give the expression

$$\iint_R \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_z|}\,dx\,dy \tag{29b}$$

for the area, where the region R is again the projection of the surface
on the x, y-plane.
Let us apply the area formula to the area of a spherical surface. The
equation

$$z = \sqrt{R^2 - x^2 - y^2}$$

represents a hemisphere of radius R. We have

$$\frac{\partial z}{\partial x} = -\frac{x}{\sqrt{R^2 - x^2 - y^2}}, \qquad \frac{\partial z}{\partial y} = -\frac{y}{\sqrt{R^2 - x^2 - y^2}}.$$

The area of the full sphere is therefore given by the integral

$$A = 2R \iint \frac{dx\,dy}{\sqrt{R^2 - x^2 - y^2}},$$

where the region of integration is the circle of radius R lying in the
x, y-plane and having the origin as its center. Introducing polar
coordinates and resolving the integral into single integrals we obtain

$$A = 2R \int_0^{2\pi} d\theta \int_0^R \frac{r\,dr}{\sqrt{R^2 - r^2}} = 4\pi R \int_0^R \frac{r\,dr}{\sqrt{R^2 - r^2}}.$$

The ordinary integral on the right can easily be evaluated by means
of the substitution $R^2 - r^2 = u$; we have

$$A = -4\pi R \sqrt{R^2 - r^2}\,\Big|_0^R = 4\pi R^2,$$

in agreement with the result of Archimedes.

In the definition of “area”, we have hitherto singled out the
coordinate z. If the surface had been given by an equation of the form
x = x(y, z) or y = y(x, z), however, we could have represented the area
similarly by the integrals

$$\iint \sqrt{1 + x_y^2 + x_z^2}\,dy\,dz \qquad\text{or}\qquad \iint \sqrt{1 + y_x^2 + y_z^2}\,dz\,dx$$

or, if the surface were given implicitly, by

$$\iint \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_x|}\,dy\,dz \tag{29c}$$

or

$$\iint \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_y|}\,dz\,dx. \tag{29d}$$

That all these expressions do actually define the same area can be
verified directly. To this end, we apply the transformation

$$x = x(y, z), \qquad y = y$$

to the integral

$$\iint_R \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_z|}\,dx\,dy.$$

Here x = x(y, z) is found by solving the equation $\phi(x, y, z) = 0$ for x.
The Jacobian is

$$\frac{d(x, y)}{d(y, z)} = \frac{\phi_z}{\phi_x},$$

and therefore,

$$\iint_R \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_z|}\,dx\,dy = \iint_{R'} \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_x|}\,dy\,dz.$$

The integral on the right is to be taken over the projection R' of the
surface on the y, z-plane.
If we wish to get rid of any special assumption about the position
of the surface relative to the coordinate system, we must represent the
surface in the parametric form

$$x = \phi(u, v), \qquad y = \psi(u, v), \qquad z = \chi(u, v)$$

and express the area of the surface as an integral over the parameter
domain R. A definite region R of the u, v-plane then corresponds to
the surface. In order to introduce the parameters u and v in (29a), we
first consider a portion of the surface near a point at which the
Jacobian

$$\frac{d(x, y)}{d(u, v)} = \phi_u\psi_v - \psi_u\phi_v = D$$

is different from zero. For this portion we can solve for u and v as
functions of x and y and obtain (see p. 261)

$$u_x = \frac{\psi_v}{D}, \qquad u_y = -\frac{\phi_v}{D}, \qquad v_x = -\frac{\psi_u}{D}, \qquad v_y = \frac{\phi_u}{D}.$$

By means of the equations

$$\frac{\partial z}{\partial x} = \frac{\partial z}{\partial u}u_x + \frac{\partial z}{\partial v}v_x \qquad\text{and}\qquad \frac{\partial z}{\partial y} = \frac{\partial z}{\partial u}u_y + \frac{\partial z}{\partial v}v_y$$

we obtain the expression

$$\sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2} = \frac{1}{|D|}\sqrt{(\phi_u\psi_v - \psi_u\phi_v)^2 + (\psi_u\chi_v - \chi_u\psi_v)^2 + (\chi_u\phi_v - \phi_u\chi_v)^2}.$$

If we now introduce u and v as new independent variables and apply
the rules for the transformation of double integrals (16b), p. 403, we
find that the area A' of the portion of the surface corresponding
to a parameter region R' is

$$A' = \iint_{R'} \sqrt{(\phi_u\psi_v - \psi_u\phi_v)^2 + (\psi_u\chi_v - \chi_u\psi_v)^2 + (\chi_u\phi_v - \phi_u\chi_v)^2}\,du\,dv.$$

In this expression no distinction appears between the coordinates x,
y, and z. Since we arrive at the same integral expression for the area 
no matter which one of the special nonparametric representations we 
start with, it follows that all these expressions are equal and rep- 
resent the area. 

So far we have only considered a portion of the surface on which 
one particular Jacobian does not vanish. We reach the same result, 
however, no matter which of the three Jacobians does not vanish. If 
then we suppose that at each point of the surface at least one of the 
Jacobians is not zero, we can subdivide the whole surface into 
portions like the above and thus find that the same integral still gives 
the area A of the whole surface: 


$$A = \iint_R \sqrt{(\phi_u\psi_v - \psi_u\phi_v)^2 + (\psi_u\chi_v - \chi_u\psi_v)^2 + (\chi_u\phi_v - \phi_u\chi_v)^2}\,du\,dv. \tag{30a}$$


The expression for the area of a surface in parametric represen- 
tation can be put in another noteworthy form if we make use of the 
coefficients of the line element (cf. Chapter 3, p. 283) 


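$$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2,$$

that is, of the expressions

$$E = \phi_u^2 + \psi_u^2 + \chi_u^2,$$
$$F = \phi_u\phi_v + \psi_u\psi_v + \chi_u\chi_v,$$
$$G = \phi_v^2 + \psi_v^2 + \chi_v^2.$$

A simple calculation shows that (see p. 284)

$$EG - F^2 = (\phi_u\psi_v - \psi_u\phi_v)^2 + (\psi_u\chi_v - \chi_u\psi_v)^2 + (\chi_u\phi_v - \phi_u\chi_v)^2. \tag{30b}$$

Thus, for the area we obtain the expression

$$A = \iint_R \sqrt{EG - F^2}\,du\,dv, \tag{30c}$$

and for the element of area

$$d\sigma = \sqrt{EG - F^2}\,du\,dv. \tag{30d}$$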
As an example, we again consider the area of a sphere with radius
R, which we now represent parametrically by the equations

$$x = R\cos u \sin v,$$
$$y = R\sin u \sin v,$$
$$z = R\cos v,$$

where u and v range over the region $0 \le u \le 2\pi$ and $0 \le v \le \pi$. A
simple calculation shows that here

$$d\sigma = R^2 \sin v\,du\,dv, \tag{30e}$$

which once more gives us the expression

$$R^2 \int_0^{2\pi} du \int_0^{\pi} \sin v\,dv = 4\pi R^2$$

for the area.
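The "simple calculation" for the sphere may be spelled out as follows. This is a sketch of the intermediate steps, using only the parametric equations of the sphere and the definitions of E, F, G given earlier in this section.

```latex
\begin{aligned}
x_u &= -R\sin u\,\sin v, & y_u &= R\cos u\,\sin v, & z_u &= 0,\\
x_v &= R\cos u\,\cos v,  & y_v &= R\sin u\,\cos v, & z_v &= -R\sin v,\\[4pt]
E &= x_u^2 + y_u^2 + z_u^2 = R^2\sin^2 v, &
F &= x_u x_v + y_u y_v + z_u z_v = 0, &
G &= x_v^2 + y_v^2 + z_v^2 = R^2,\\[4pt]
\sqrt{EG - F^2} &= R^2\,|\sin v| = R^2 \sin v \quad (0 \le v \le \pi).
\end{aligned}
```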

More generally, we can apply formula (30d) to the surface of revolution
formed by rotating the curve $z = \phi(x)$ about the z-axis. If we refer
the surface to polar coordinates (u, v) in the x, y-plane as parameters,
we obtain

$$x = u\cos v, \qquad y = u\sin v, \qquad z = \phi\bigl(\sqrt{x^2 + y^2}\bigr) = \phi(u).$$

Then,

$$E = 1 + \phi'^2(u), \qquad F = 0, \qquad G = u^2,$$

and the area is given in the form

$$\int_0^{2\pi} dv \int_{u_0}^{u_1} u\sqrt{1 + \phi'^2(u)}\,du = 2\pi \int_{u_0}^{u_1} u\sqrt{1 + \phi'^2(u)}\,du. \tag{31a}$$

If instead of u we introduce the length of arc s of the meridian curve
$z = \phi(u)$ as parameter, we obtain the area of the surface of revolution
in the form

$$2\pi \int_{s_0}^{s_1} u\,ds, \tag{31b}$$

where u is the distance from the axis of the point on the rotating curve
corresponding to s (Guldin’s rule; cf. Volume I, p. 374).
We apply this rule to calculate the surface area of the torus (cf.
Chapter 3, p. 286) obtained by rotating the circle $(x - a)^2 + z^2 = r^2$
about the z-axis. If we introduce the length of arc s of the circle as a
parameter, we have $u = a + r\cos(s/r)$, and the area is therefore

$$2\pi \int_0^{2\pi r} u\,ds = 2\pi \int_0^{2\pi r} \left(a + r\cos\frac{s}{r}\right) ds = 2\pi a \cdot 2\pi r.$$


The area of a torus is therefore equal to the product of the circumfer- 
ence of the generating circle and the length of the path described by 
the center of the circle. 
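The parametric area formula with the integrand $\sqrt{EG - F^2}$ lends itself to a numerical check of both the sphere and the torus results. The sketch below is illustrative only: the helper names `surface_area`, `sphere` and `torus`, the grid size, and the finite-difference step are assumptions of this example, and the parameters u, v of the torus are taken here as two angles rather than as the arc length used in the text.

```python
import math

def surface_area(param, u_range, v_range, n=200, h=1e-6):
    """Approximate the double integral of sqrt(E*G - F^2) du dv (midpoint rule).

    param(u, v) returns the point (x, y, z); partial derivatives are
    estimated by central differences with step h.
    """
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            pu = [(p - q) / (2 * h) for p, q in zip(param(u + h, v), param(u - h, v))]
            pv = [(p - q) / (2 * h) for p, q in zip(param(u, v + h), param(u, v - h))]
            E = sum(c * c for c in pu)
            F = sum(c * d for c, d in zip(pu, pv))
            G = sum(c * c for c in pv)
            total += math.sqrt(max(E * G - F * F, 0.0)) * du * dv
    return total

def sphere(R):
    return lambda u, v: (R * math.cos(u) * math.sin(v),
                         R * math.sin(u) * math.sin(v),
                         R * math.cos(v))

def torus(a, r):
    # Circle (x - a)^2 + z^2 = r^2 rotated about the z-axis (requires a > r).
    return lambda u, v: ((a + r * math.cos(v)) * math.cos(u),
                         (a + r * math.cos(v)) * math.sin(u),
                         r * math.sin(v))

sphere_area = surface_area(sphere(2.0), (0.0, 2 * math.pi), (0.0, math.pi))
torus_area = surface_area(torus(3.0, 1.0), (0.0, 2 * math.pi), (0.0, 2 * math.pi))
print(sphere_area, 4 * math.pi * 2.0 ** 2)
print(torus_area, (2 * math.pi * 3.0) * (2 * math.pi * 1.0))
```

Each printed pair compares the numerical value with the closed form from the text: $4\pi R^2$ for the sphere and $2\pi a \cdot 2\pi r$ for the torus.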


Exercises 4.8 
1. Calculate the volume of the solid defined by 


2 2 2 
Wty — YP tea esl (a <1). 
2. Find the volume cut off from the paraboloid $(x^2/a^2) + (y^2/b^2) = z$ by the
plane z = h.

3. Find the volume cut off from the ellipsoid $(x^2/a^2) + (y^2/b^2) + (z^2/c^2) = 1$
by the plane lx + my + nz = p.

4. (a) Show that if any closed curve $\theta = f(\phi)$ is drawn on the surface
$r^2 = a^2 \cos 2\theta$ ($r, \theta, \phi$ being spherical coordinates in space), the area of the
surface so enclosed is equal to the area enclosed by the projection of
the curve on the sphere r = a, the origin of coordinates being the
vertex of projection.

(b) Express the area by a simple integral. 
(c) Find the area of the whole surface. 

5. Find the volume and surface area of the solid generated by rotating the 
triangle ABC about the side AB. 

6. Find the surface area of the paraboloid $z = x^2 + y^2$ intercepted between
the cylinders $x^2 + y^2 = a$ and $x^2 + y^2 = b$, where $a = \tfrac14[(2m - 1)^2 - 1]$
and $b = \tfrac14[(2n - 1)^2 + 1]$, m and n being natural numbers with n > m.

7. Find the surface area of the section cut out of the cylinder $x^2 + z^2 = a^2$
by the cylinder $x^2 + y^2 = b^2$, where $0 < b \le a$ and $z \ge 0$.

8. Show that the area of the right conoid

$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = f(\theta),$$

included between two planes through the axis of z and the cylinder
with generating lines parallel to this axis and cross section $r = f'(\theta)$,
and the area of its orthogonal projection on z = 0 are in the ratio

$$[\sqrt{2} + \log(1 + \sqrt{2})] : 1.$$
4.9 Physical Applications


In Section 4.4 (p. 386) we have already seen how the concept of 
mass is connected with that of a multiple integral. Here we shall study 
some of the other concepts of mechanics. We begin with a detailed 
study of moment and of moment of inertia. 


a. Moments and Center of Mass 


The moment with respect to the x,y-plane of a particle with mass m 
is defined as the product mz of the mass and the z-coordinate. Similarly, 
the moment with respect to the y,z-plane is mx and that with respect 
to the z,x-plane is my. The moments of several particles combine 
additively; that is, the three moments of a system of particles with 
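masses $m_1, m_2, \ldots, m_n$ and coordinates $(x_1, y_1, z_1), \ldots, (x_n, y_n, z_n)$
are given by the expressions

$$T_x = \sum_{\nu=1}^{n} m_\nu x_\nu, \qquad T_y = \sum_{\nu=1}^{n} m_\nu y_\nu, \qquad T_z = \sum_{\nu=1}^{n} m_\nu z_\nu. \tag{32a}$$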


If we deal with a mass distributed with continuous density
$\mu = \mu(x, y, z)$ through a region in space or over a surface or curve, we
define the moment of the mass-distribution by a limiting process, as
in Volume I (p. 373) and thus express the moments by integrals. For
example, given a distribution in space we subdivide the region R
into n subregions, imagine the total mass of each subregion concentrated
at any one of its points, and then form the moment of the system
of these n particles. We see at once that as $n \to \infty$ and the greatest
diameter of the subregions tends at the same time to zero, the sums
tend to the limits

$$T_x = \iiint_R \mu x\,dx\,dy\,dz, \qquad T_y = \iiint_R \mu y\,dx\,dy\,dz, \qquad T_z = \iiint_R \mu z\,dx\,dy\,dz, \tag{32b}$$


which we call the moments of the volume-distribution. 

Similarly, if the mass is distributed over a surface S given by the
equations $x = \phi(u, v)$, $y = \psi(u, v)$, $z = \chi(u, v)$ with density $\mu(u, v)$, we
define the moments of the surface distribution by the expressions

$$T_x = \iint_S \mu x\,d\sigma = \iint_R \mu x \sqrt{EG - F^2}\,du\,dv,$$
$$T_y = \iint_S \mu y\,d\sigma = \iint_R \mu y \sqrt{EG - F^2}\,du\,dv, \tag{32c}$$
$$T_z = \iint_S \mu z\,d\sigma = \iint_R \mu z \sqrt{EG - F^2}\,du\,dv.$$


Finally, the moments of a curve x(s), y(s), z(s) in space with mass
density $\mu(s)$ are defined by the expressions

$$T_x = \int_{s_0}^{s_1} \mu x\,ds, \qquad T_y = \int_{s_0}^{s_1} \mu y\,ds, \qquad T_z = \int_{s_0}^{s_1} \mu z\,ds, \tag{32d}$$


where s denotes the length of arc. 
The center of mass of a mass of total amount M distributed through
a region R is defined as the point with coordinates

$$\xi = \frac{T_x}{M}, \qquad \eta = \frac{T_y}{M}, \qquad \zeta = \frac{T_z}{M}. \tag{32e}$$

For a distribution in space, the coordinates of the center of mass are
therefore given by the expressions

$$\xi = \frac{1}{M}\iiint_R \mu x\,dx\,dy\,dz, \qquad \eta = \frac{1}{M}\iiint_R \mu y\,dx\,dy\,dz, \qquad \zeta = \frac{1}{M}\iiint_R \mu z\,dx\,dy\,dz,$$

where

$$M = \iiint_R \mu\,dx\,dy\,dz.$$

If the mass-distribution is homogeneous, $\mu(x, y, z)$ = constant, the
center of mass of the region is called its centroid.¹

As our first example, we consider the homogeneous hemispherical
region H with mass density 1:

$$x^2 + y^2 + z^2 \le 1, \qquad z \ge 0.$$

The two moments

$$T_x = \iiint_H x\,dx\,dy\,dz, \qquad T_y = \iiint_H y\,dx\,dy\,dz$$

are 0, since the respective integrations with respect to x or y give the
value 0. For the third,

$$T_z = \iiint_H z\,dx\,dy\,dz,$$

we introduce cylindrical coordinates $(r, z, \theta)$ by means of the equations

$$z = z, \qquad x = r\cos\theta, \qquad y = r\sin\theta$$

and obtain

$$T_z = \int_0^1 z\,dz \int_0^{\sqrt{1 - z^2}} r\,dr \int_0^{2\pi} d\theta = 2\pi \int_0^1 \frac{1 - z^2}{2}\,z\,dz = \pi\left[\frac{z^2}{2} - \frac{z^4}{4}\right]_0^1 = \frac{\pi}{4}.$$

Since the total mass is $2\pi/3$, the coordinates of the center of mass are
x = 0, y = 0, z = 3/8.

¹The centroid is clearly independent of the choice of the constant positive value of
the mass density. Thus, it may be thought of as a geometrical concept associated
only with the shape of the region R, not dependent on the mass-distribution.
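The hemisphere computation above reduces, after the integrations over r and θ, to two single integrals in z. The sketch below evaluates them by the midpoint rule for any homogeneous solid of revolution about the z-axis; the function name `centroid_height` and the number of subintervals are assumptions of this example, not part of the text.

```python
import math

def centroid_height(radius, a, b, n=2000):
    """z-coordinate of the centroid of the homogeneous solid of revolution
    a <= z <= b, sqrt(x^2 + y^2) <= radius(z), using the midpoint rule."""
    dz = (b - a) / n
    mass = moment = 0.0
    for k in range(n):
        z = a + (k + 0.5) * dz
        area = math.pi * radius(z) ** 2   # cross-sectional area at height z
        mass += area * dz                 # M   = integral of area dz
        moment += z * area * dz           # T_z = integral of z * area dz
    return moment / mass

# Unit hemisphere: cross-section radius sqrt(1 - z^2) for 0 <= z <= 1.
hemisphere = centroid_height(lambda z: math.sqrt(1.0 - z * z), 0.0, 1.0)
print(hemisphere)
```

For the unit hemisphere the value should agree closely with the 3/8 found above.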

Next, we calculate the center of mass of a hemispherical surface
of unit radius over which a mass of unit density is uniformly
distributed. For the parametric representation

$$x = \cos u \sin v, \qquad y = \sin u \sin v, \qquad z = \cos v$$

we calculate the surface element from formula (30e) on p. 429 and find
that

$$d\sigma = \sqrt{EG - F^2}\,du\,dv = \sin v\,du\,dv. \tag{32g}$$

Accordingly, we obtain

$$T_x = \int_0^{\pi/2} \sin^2 v\,dv \int_0^{2\pi} \cos u\,du = 0,$$
$$T_y = \int_0^{\pi/2} \sin^2 v\,dv \int_0^{2\pi} \sin u\,du = 0,$$
$$T_z = \int_0^{\pi/2} \sin v \cos v\,dv \int_0^{2\pi} du = 2\pi\,\frac{\sin^2 v}{2}\,\Big|_0^{\pi/2} = \pi$$

for the three moments. Since the total mass is obviously $2\pi$, we see that
the center of mass lies at the point with coordinates x = 0, y = 0,
z = 1/2.


b. Moment of Inertia 


The generalization of the concept of moment of inertia to a
continuous mass-distribution is equally obvious. The moment of inertia
of a particle with respect to the x-axis is the product of its mass and of
$\rho^2 = y^2 + z^2$, that is, of the square of the distance of the point from
the x-axis. In the same way, we define the moment of inertia about the
x-axis of a mass distributed with density $\mu(x, y, z)$ through a region
R by the expression

$$\iiint_R \mu\,(y^2 + z^2)\,dx\,dy\,dz. \tag{33a}$$

The moments of inertia about the other axes are represented by
similar expressions. Occasionally, the moment of inertia with respect
to a point, say the origin, is defined by the expression

$$\iiint_R \mu\,(x^2 + y^2 + z^2)\,dx\,dy\,dz, \tag{33b}$$

and the moment of inertia with respect to a plane, say the y, z-plane,
by

$$\iiint_R \mu\,x^2\,dx\,dy\,dz. \tag{33c}$$

Similarly, the moment of inertia, with respect to the x-axis, of a
surface distribution is given by

$$\iint_S \mu\,(y^2 + z^2)\,d\sigma, \tag{33d}$$

where $\mu(u, v)$ is a continuous function of two parameters u and v.

The moment of inertia of a mass distributed with density $\mu(x, y, z)$
through a region R, with respect to an axis parallel to the x-axis and
passing through the point $(\xi, \eta, \zeta)$, is given by the expression

$$\iiint_R \mu\,[(y - \eta)^2 + (z - \zeta)^2]\,dx\,dy\,dz. \tag{33e}$$

If in particular we let $(\xi, \eta, \zeta)$ be the center of mass and recall the
relations (32e) for the coordinates of the center of mass, we at once
obtain the equation

$$\iiint_R \mu\,(y^2 + z^2)\,dx\,dy\,dz = \iiint_R \mu\,[(y - \eta)^2 + (z - \zeta)^2]\,dx\,dy\,dz + (\eta^2 + \zeta^2)\iiint_R \mu\,dx\,dy\,dz. \tag{33f}$$

Since any arbitrary axis of rotation of a body can be chosen as the
x-axis, the meaning of this equation can be expressed as follows: 


The moment of inertia of a rigid body with respect to an arbitrary 
axis of rotation is equal to the moment of inertia of the body about a 
parallel axis through its center of mass plus the product of the total mass 
and the square of the distance between the center of mass and the axis 
of rotation (Huygens’s theorem). 
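Huygens's theorem is stated in the text for continuous distributions. As a quick sanity check, the same identity can be verified for a finite system of particles, which is the discrete analogue. The particle data and the helper names below are hypothetical, chosen only for illustration.

```python
def inertia_parallel_to_x(particles, eta=0.0, zeta=0.0):
    """Moment of inertia of point masses (m, x, y, z) about the line
    y = eta, z = zeta, which is parallel to the x-axis."""
    return sum(m * ((y - eta) ** 2 + (z - zeta) ** 2) for m, x, y, z in particles)

def center_of_mass(particles):
    total = sum(p[0] for p in particles)
    return tuple(sum(p[0] * p[i] for p in particles) / total for i in (1, 2, 3))

# Hypothetical particles, each given as (mass, x, y, z).
particles = [(2.0, 0.0, 1.0, 0.5), (1.0, 1.0, -2.0, 3.0), (3.0, -1.0, 0.0, -1.0)]
M = sum(p[0] for p in particles)
xi, eta, zeta = center_of_mass(particles)

lhs = inertia_parallel_to_x(particles)                       # about the x-axis
rhs = inertia_parallel_to_x(particles, eta, zeta) + M * (eta ** 2 + zeta ** 2)
print(lhs, rhs)
```

The two printed numbers agree up to rounding: the moment about the x-axis equals the moment about the parallel axis through the center of mass plus the total mass times the squared distance between the axes.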

The physical meaning of the moment of inertia for regions in 
several dimensions is exactly the same as that already stated in 
Volume I, p. 375: 


The kinetic energy of a body rotating uniformly about an axis is equal 
to half the product of the square of the angular velocity and the moment 
of inertia. 

We calculate the moment of inertia for some simple cases. 

For the sphere V with center at the origin, unit radius and unit
density, we see by symmetry that the moment of inertia with respect
to any axis through the origin is

$$I = \iiint_V (x^2 + y^2)\,dx\,dy\,dz = \iiint_V (x^2 + z^2)\,dx\,dy\,dz = \iiint_V (y^2 + z^2)\,dx\,dy\,dz.$$

If we add the three integrals, we obtain

$$3I = 2\iiint_V (x^2 + y^2 + z^2)\,dx\,dy\,dz.$$

In spherical coordinates,

$$I = \frac{2}{3}\int_0^1 r^4\,dr \int_0^{\pi} \sin v\,dv \int_0^{2\pi} du = \frac{2}{3}\cdot\frac{1}{5}\cdot 2\cdot 2\pi = \frac{8\pi}{15}.$$

For a beam with edges a, b, c parallel to the x-axis, the y-axis, and
the z-axis, respectively, with unit density and center of mass at the
origin, we find that the moment of inertia with respect to the x, y-plane
is

$$\int_{-a/2}^{a/2} dx \int_{-b/2}^{b/2} dy \int_{-c/2}^{c/2} z^2\,dz = \frac{abc^3}{12}.$$
c. The Compound Pendulum


The notion of moment of inertia finds an application in the mathe- 
matical treatment of the compound pendulum, that is, of a rigid body 
which oscillates about a fixed horizontal axis under the influence of 
gravity. 

We consider a plane through G, the center of mass of the rigid body,
perpendicular to the axis of rotation; let this plane cut the axis in the
point O (Fig. 4.16). The motion of the body is given as a function of time
by the angle $\phi = \phi(t)$ that OG makes at time t with the downward vertical
line through O. In order to determine the function $\phi$ and also the
period of oscillation of the pendulum, we assume a knowledge of
certain physical facts (see p. 658). We make use of the law of
conservation of energy, which states that during the motion of the body
the sum of its kinetic and potential energies remains constant. Here V,
the potential energy of the body, is the product Mgh, where M is the
total mass, g the gravitational acceleration, and h the height of the
center of mass above an arbitrary horizontal line (e.g., above the
horizontal line through the lowest position reached by the center of
mass during the motion). If we denote OG, the distance of the center
of mass from the axis, by s, then $V = Mgs\,(1 - \cos\phi)$. By p. 435 the
kinetic energy is given by $T = \tfrac12 I\dot\phi^2$, where I is the moment of inertia
of the body with respect to the axis of rotation and we have written
$\dot\phi$ for $d\phi/dt$. The law of conservation of energy therefore gives the
equation

$$\tfrac{1}{2} I\dot\phi^2 - Mgs\cos\phi = \text{constant}. \tag{34a}$$

Figure 4.16


If we introduce the constant l = I/Ms, this is exactly the same as the
equation previously found¹ (Volume I, pp. 408, 410) for the simple
pendulum; l is accordingly known as the length of the equivalent
simple pendulum.

We can now apply the formulas obtained for the simple pendulum
(Volume I, p. 410) directly. The period of oscillation is given by the
formula

$$T = 2\sqrt{\frac{l}{2g}} \int_{-\phi_0}^{\phi_0} \frac{d\phi}{\sqrt{\cos\phi - \cos\phi_0}},$$

where $\phi_0$ corresponds to the greatest displacement of the center of
mass; for small angles this is approximately

$$T = 2\pi\sqrt{\frac{l}{g}} = 2\pi\sqrt{\frac{I}{Mgs}}.$$


The formula for the simple pendulum is of course included in this as
a special case, for if the whole mass M is concentrated at the center
of mass, then $I = Ms^2$, so that l = s.

Investigating further, we recall that I, the moment of inertia about
the axis of rotation, is connected with $I_0$, the moment of inertia about
a parallel axis through the center of mass, by the relation (cf. 33f)

$$I = I_0 + Ms^2.$$

Hence,

$$l = \frac{I}{Ms} = s + \frac{I_0}{Ms}.$$

We see at once that in a compound pendulum l always exceeds s,
so that the period of a compound pendulum is always greater than
that of the simple pendulum obtained by concentrating the mass M
at the center of mass. Moreover, the period is the same for all parallel
axes at the same distance s from the center of mass, for the length of
the equivalent simple pendulum depends only on the two quantities s
and $a = I_0/M$ and therefore remains the same, provided neither the
direction of the axis of rotation nor its distance from the center of
mass is altered.

¹In the notation used here the motion of the point mass in the simple pendulum is
described by $x = l\sin\phi$, $y = -l\cos\phi$ and its speed by $l\dot\phi$. Here $\phi$, by Volume I,
p. 408, satisfies the differential equation

$$\tfrac{1}{2}(l\dot\phi)^2 - gl\cos\phi = \text{constant}.$$

The formula

$$T = 2\pi\sqrt{\frac{s + a/s}{g}}$$

shows that the period T increases beyond all bounds as s tends to 0
or to infinity. It must therefore have a minimum for some value $s_0$.
By differentiating we obtain

$$s_0 = \sqrt{a} = \sqrt{\frac{I_0}{M}}.$$

A pendulum whose axis is at a distance $s_0 = \sqrt{I_0/M}$ from the center of
mass will be relatively insensitive to small displacements of the axis, 
for in this case dT/ds vanishes, so that first-order changes in s produce 
only second-order changes in T. This fact has been applied by Profes- 
sor M. Schuler of Göttingen in the construction of very accurate 
clocks. 
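The insensitivity of the period near $s_0$ can be seen numerically from the small-angle formula $T = 2\pi\sqrt{(s + a/s)/g}$ with $a = I_0/M$. The following sketch uses made-up values of $I_0$ and M (SI units assumed) and the function name `small_angle_period`, none of which come from the text.

```python
import math

def small_angle_period(s, I0, M, g=9.81):
    """T = 2*pi*sqrt((s + a/s)/g) with a = I0/M, valid for small oscillations."""
    a = I0 / M
    return 2.0 * math.pi * math.sqrt((s + a / s) / g)

I0, M = 0.5, 2.0            # hypothetical values, SI units assumed
s0 = math.sqrt(I0 / M)      # stationary distance sqrt(I0/M) from the text
ds = 1e-3                   # small displacement of the axis

T0 = small_angle_period(s0, I0, M)
T_plus = small_angle_period(s0 + ds, I0, M)
T_minus = small_angle_period(s0 - ds, I0, M)
print(T0, T_plus - T0, T_minus - T0)
```

Both differences are positive and of the order of ds squared, which illustrates that first-order changes in s produce only second-order changes in T at $s = s_0$.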


d. Potential of Attracting Masses 


We have seen in Chapter 2 (p. 208) that Newton’s law of gravitation
gives the force that a fixed particle Q with coordinates $(\xi, \eta, \zeta)$ and
mass m exerts on a second particle P with coordinates (x, y, z) and
unit mass, apart from the gravitational constant γ, as

$$m\,\operatorname{grad}\frac{1}{r},$$

where

$$r = \sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}$$

is the distance between the points P and Q. The direction of the force
is along the line joining the two particles, and its magnitude is
inversely proportional to the square of the distance. Here the gradient
of a function f(x, y, z) is the vector with components

$$\frac{\partial f}{\partial x}, \qquad \frac{\partial f}{\partial y}, \qquad \frac{\partial f}{\partial z}.$$

Hence, in our case the force has the components

$$\frac{m(\xi - x)}{r^3}, \qquad \frac{m(\eta - y)}{r^3}, \qquad \frac{m(\zeta - z)}{r^3}.$$

If we now consider the force exerted on P by a number of points
$Q_1, Q_2, \ldots, Q_n$ with respective masses $m_1, m_2, \ldots, m_n$, we can express
the total force as the gradient of the quantity

$$\frac{m_1}{r_1} + \frac{m_2}{r_2} + \cdots + \frac{m_n}{r_n},$$

where $r_\nu$ denotes the distance of the point $Q_\nu$ from the point P. If a
force can be expressed as a gradient of a function, it is customary to
call this function the potential of the force;¹ we accordingly define the
gravitational potential of the system of particles $Q_1, Q_2, \ldots, Q_n$ at the
point P as the expression

$$\sum_{\nu=1}^{n} \frac{m_\nu}{\sqrt{(x - \xi_\nu)^2 + (y - \eta_\nu)^2 + (z - \zeta_\nu)^2}}.$$

We now suppose that instead of being concentrated at a finite
number of points the gravitating masses are distributed with
continuous density over a portion R of space or a surface S or a curve
C. Then the potential of this mass-distribution at a point with
coordinates (x, y, z) outside the system of masses is defined as

$$\iiint_R \frac{\mu}{r}\,d\xi\,d\eta\,d\zeta, \tag{35a}$$

or

$$\iint_S \frac{\mu}{r}\,d\sigma, \tag{35b}$$

or

$$\int_C \frac{\mu}{r}\,ds. \tag{35c}$$

¹Often the negative of this function, which has the meaning of potential energy, is
called the potential of the forces.
In the first case, the integration is taken throughout the region R
with rectangular coordinates $(\xi, \eta, \zeta)$; in the second case, over the
surface S with the element of surface $d\sigma$; and in the third case, along
the curve with length of arc s. In all three formulae, r denotes the
distance of the point P from the point $(\xi, \eta, \zeta)$ of the region of
integration and μ the mass density at the point $(\xi, \eta, \zeta)$. In each case the
force of attraction is found by forming the first derivatives of the 
potential with respect to x, y, z. Working with the potential rather 
than with the force has the advantage that only one integral instead 
of three has to be evaluated. The three force components are then 
obtained as derivatives of the potential. 

For example, the potential at the point P with coordinates (x, y, z)
due to a sphere K with uniform density 1, with unit radius and with
center at the origin, is the integral

$$\iiint_K \frac{d\xi\,d\eta\,d\zeta}{\sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}} = \int_{-1}^{+1} d\xi \int_{-\sqrt{1 - \xi^2}}^{+\sqrt{1 - \xi^2}} d\eta \int_{-\sqrt{1 - \xi^2 - \eta^2}}^{+\sqrt{1 - \xi^2 - \eta^2}} \frac{1}{r}\,d\zeta.$$

In all the expressions (35a, b, c) the coordinates (x, y, z) of the point
P appear not as variables of integration but as parameters, and the
potentials are functions of these parameters. 

To obtain the components of the force from the potential we have
to differentiate the integral with respect to the parameters. The rules
for differentiation with respect to a parameter extend directly to
multiple integrals, and by p. 74, the differentiation can be performed
under the integral sign, provided that the point P does not belong to
the region of integration, that is, provided that we are certain that
there is no point of the closed region of integration for which the
distance r has the value 0. Thus, for example, we find that the components
of the gravitational force on a unit mass due to a mass distributed with
unit density through a region R in space are given by the expressions

$$F_x = -\iiint_R \frac{x - \xi}{r^3}\,d\xi\,d\eta\,d\zeta,$$
$$F_y = -\iiint_R \frac{y - \eta}{r^3}\,d\xi\,d\eta\,d\zeta, \tag{36}$$
$$F_z = -\iiint_R \frac{z - \zeta}{r^3}\,d\xi\,d\eta\,d\zeta.$$

Finally, we point out that the expressions for the potential and its
first derivatives continue to have a meaning if the point P lies in the
interior of the region of integration. The integrals are then improper
integrals, and as is easily shown, their convergence follows from the
criteria of Section 4.7.

As an illustration, we calculate the potential at an internal point
and at an external point due to a spherical surface S with radius a
and unit density. If we take the center of the sphere as the origin and
let the x-axis pass through the point P (inside or outside the sphere),
the point P will have the coordinates (x, 0, 0), and the potential will be

$$U = \iint_S \frac{d\sigma}{\sqrt{(x - \xi)^2 + \eta^2 + \zeta^2}}.$$

If we introduce spherical coordinates on the sphere through the
equations

$$\xi = a\cos\theta,$$
$$\eta = a\sin\theta\cos\phi,$$
$$\zeta = a\sin\theta\sin\phi,$$

then [see (30e), p. 429]

$$U = \int_0^{\pi} \frac{a^2\sin\theta}{\sqrt{x^2 + a^2 - 2ax\cos\theta}}\,d\theta \int_0^{2\pi} d\phi = 2\pi \int_0^{\pi} \frac{a^2\sin\theta}{\sqrt{x^2 + a^2 - 2ax\cos\theta}}\,d\theta.$$

We put $x^2 + a^2 - 2ax\cos\theta = r^2$, so that $ax\sin\theta\,d\theta = r\,dr$, and
(provided that $x \ne 0$) the integral then becomes

$$U = \frac{2\pi a}{x} \int_{|x - a|}^{|x + a|} \frac{r\,dr}{r} = \frac{2\pi a}{x}\bigl(|x + a| - |x - a|\bigr).$$

For |x| > a we therefore have

$$U = \frac{4\pi a^2}{|x|},$$

and for |x| < a,

$$U = 4\pi a.$$
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Hence, the potential at an external point is the same as if the whole 
mass 4ra? were concentrated at the center of the sphere. On the other 
hand, throughout the interior the potential is constant. At the surface 
of the sphere the potential is continuous; the expression for U is still 
defined (as an improper integral) and has the value 4ra. The com- 
ponent of force Fz in the x-direction, however, has a jump of amount 
— 4r at the surface of the sphere, for if |x|> a, we have 


while Fz = 0 if |x| <a. 

The potential of a solid sphere of unit density is found from that 
of a spherical surface by integrating with respect to a. This gives the 
value 


$$\frac{4\pi a^3}{3|x|}$$


for the potential at an external point. This again is the same as if the 
total mass $(4/3)\pi a^3$ were concentrated at the center. By differentiation
with respect to x we find for a point on the positive x-axis that 


$$F_x = -\frac{4\pi a^3}{3x^2}.$$

This is Newton’s result that the attraction exerted by a solid sphere 
of constant density on an external point is the same as if the mass of 
the sphere were concentrated at its center (Volume I, p. 413). 
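
The closed forms for the potential of the spherical surface can be checked numerically against the $\theta$-integral derived above. The following Python sketch is only an illustration: the function names, the midpoint rule, and the sample values of x and a are assumptions made here, not part of the text.

```python
import math

def shell_potential(x, a, n=20000):
    """Midpoint-rule value of 2*pi*a^2 * int_0^pi sin(t)/sqrt(x^2+a^2-2ax cos t) dt.

    Valid for |x| != a, where the integrand is smooth.
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.sin(t) / math.sqrt(x * x + a * a - 2.0 * a * x * math.cos(t))
    return 2.0 * math.pi * a * a * total * h

def shell_potential_exact(x, a):
    """Closed form from the text: 4*pi*a^2/|x| outside the sphere, 4*pi*a inside."""
    if abs(x) > a:
        return 4.0 * math.pi * a * a / abs(x)
    return 4.0 * math.pi * a

if __name__ == "__main__":
    a = 2.0  # assumed sample radius
    for x in (0.5, 1.5, 3.0, -5.0):
        print(x, shell_potential(x, a), shell_potential_exact(x, a))
```

For interior points the numerical value does not depend on x, in agreement with the statement that the potential is constant inside the sphere.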


Exercises 4.9 


1. (a) Find the position of the centroid of a solid right circular cone.
(b) What is the position of the centroid of the curved surface of the cone?
2. Find the position of the centroid of the portion of the paraboloid
$z^2 + y^2 = px$ cut off by the plane $x = x_0$, where $x_0 < 0$.
3. Find the centroid of the tetrahedron bounded by the three coordinate
planes and the plane $x/a + y/b + z/c = 1$.
4. (a) Find the centroid of the hemispherical shell
$a^2 \le x^2 + y^2 + z^2 \le b^2$, $z \ge 0$.
(b) Show that the centroid of the hemispherical lamina
$x^2 + y^2 + z^2 = a^2$ is the limiting position of the centroid in part (a)
as b approaches a.


5. Find the moment of inertia about the z-axis of the homogeneous
rectangular parallelopiped of mass m with $0 \le x \le a$, $0 \le y \le b$,
$0 \le z \le c$.
6. Calculate the moment of inertia of the homogeneous solid enclosed
between the two cylinders
$$x^2 + y^2 = R^2 \quad\text{and}\quad x^2 + y^2 = R'^2 \qquad (R > R')$$
and the two planes $z = h$ and $z = -h$, with respect to
(a) the z-axis,
(b) the x-axis.
7. Find the mass and moment of inertia about a diameter of a sphere whose
density decreases linearly with distance from the center from a value
$\mu_0$ at the center to the value $\mu_1$ at the surface.
8. Find the moment of inertia of the ellipsoid
$x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$ with respect to
(a) the z-axis,
(b) an arbitrary axis through the origin, given by
$$x : y : z = \alpha : \beta : \gamma \qquad (\alpha^2 + \beta^2 + \gamma^2 = 1).$$
9. If A, B, C denote the moments of inertia of an arbitrary solid of positive
density with respect to the x-, y-, and z-axis, then the “triangle
inequalities”
$$A + B > C,\qquad A + C > B,\qquad B + C > A$$
are satisfied.
10. Let O be an arbitrary point and S an arbitrary body. On every ray from
O we take the point at the distance $1/\sqrt{I}$ from O, where I denotes the
moment of inertia of S with respect to the straight line coinciding with
the ray. Prove that the points so constructed form an ellipsoid (the
so-called momental ellipsoid).
11. Find the momental ellipsoid of the ellipsoid
$x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$ at the point $(\xi, \eta, \zeta)$.
12. Find the coordinates of the center of mass of the surface of the sphere
$x^2 + y^2 + z^2 = 1$, the density being given by
$$\mu = \frac{1}{\sqrt{(x-1)^2 + y^2 + z^2}}.$$
13. Find the x-coordinate of the center of mass of the octant of the ellipsoid
$x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$ $(x \ge 0,\ y \ge 0,\ z \ge 0)$.
14. A system of masses S consists of two parts $S_1$ and $S_2$; $I_1$, $I_2$, $I$ are the
respective moments of inertia of $S_1$, $S_2$, $S$ about three parallel axes
passing through the respective centers of mass. Prove that
$$I = I_1 + I_2 + \frac{m_1 m_2}{m_1 + m_2}\,d^2,$$
where $m_1$ and $m_2$ are the masses of $S_1$ and $S_2$ and d the distance between
the axes passing through their centers of mass.
15. Find the envelopes of the planes with respect to which the ellipsoid
$(x^2/a^2) + (y^2/b^2) + (z^2/c^2) \le 1$ has the same moment of inertia A.
16. Calculate the potential of the homogeneous ellipsoid of revolution
$$\frac{x^2 + y^2}{a^2} + \frac{z^2}{b^2} \le 1 \qquad (b > a)$$
at its center.
17. Calculate the potential of a solid of revolution
$$r = \sqrt{x^2 + y^2} \le f(z) \qquad (a \le z \le b)$$
at the origin.
18. Show that at sufficiently great distances the potential of a solid S is
approximated by the potential of a particle of the same total mass
located at its center of gravity with an error less than some constant
divided by the square of the distance.
19. Assuming that the earth is a sphere of radius R for which the density
at a distance r from the center is of the form
$$\rho = A - Br^2$$
and the density at the surface is 2½ times the density of water, while the
mean density is 5½ times that of water, show that the attraction at an
internal point is equal to
$$\frac{1}{11}\,g\,\frac{r}{R}\left(20 - 9\,\frac{r^2}{R^2}\right),$$
where g is the value of gravity at the surface.
20. A hemisphere of radius a and of uniform density $\rho$ is placed with its
center at the origin, so as to lie entirely on the positive side of the x, y-
plane. Show that its potential at the point (0, 0, z) is
$$\frac{2\pi\rho}{3z}\left[(a^2 + z^2)^{3/2} - a^3 + \tfrac{3}{2}a^2 z\right] - \frac{4}{3}\pi\rho z^2 \qquad\text{if } 0 < z < a$$
and
$$\frac{2\pi\rho}{3z}\left[(a^2 + z^2)^{3/2} + a^3 - \tfrac{3}{2}a^2 z - z^3\right] \qquad\text{if } z > a.$$
21. Let $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ be the vertices of a triangle of area A (the
order of the suffixes giving the positive orientation). Prove that the
moment of inertia of the triangle with respect to the x-axis is given by
$$\frac{A}{6}\,(y_1^2 + y_2^2 + y_3^2 + y_1 y_2 + y_2 y_3 + y_3 y_1).$$
22. Prove that the attraction at either pole of a uniform spheroid with
density $\rho$ and semiaxes a, a, c is equal to
$$2\pi\rho \int_0^{\pi/2} r\cos\theta\,\sin\theta\,d\theta,$$
where
$$r = \frac{2a^2 c\cos\theta}{a^2\cos^2\theta + c^2\sin^2\theta}.$$


23. It is known experimentally that a charged conducting spherical lamina 
(on such a surface the charge distributes itself uniformly) exerts zero 
force on a point charge inside the sphere. Assuming that point charges 
repel or attract each other with a force dependent only on the distance 
between them, prove that this experiment implies Coulomb’s law,
namely, that point charges attract or repel each other with a force
proportional to the inverse square of their separation. This result is the 
converse of the theorem that the force of gravity of a homogeneous 
spherical lamina vanishes in its interior. 


4.10 Multiple Integrals in Curvilinear Coordinates 


a. Resolution of Multiple Integrals 


If the region R of the x, y-plane is covered by a family of curves
$\phi(x, y) = \text{constant}$, so that each point of R lies on one, and only one,
curve of the family, we can take the quantity $\phi(x, y) = \xi$ as a new
independent variable; that is, we can take the curves $C_\xi$ represented
by $\phi(x, y) = \text{constant} = \xi$ as one of the two families of curves in a
coordinate grid.

For the second independent variable we can choose the quantity
$\eta = y$, provided that we restrict ourselves to a region R in which each
pair of curves $\phi(x, y) = \text{constant}$ and $y = \text{constant}$ intersect in one
point.

If we introduce these new variables, a double integral $\iint_R f(x, y)\,dx\,dy$
is transformed as follows [cf. (16b), p. 403]:

$$\iint_R f(x, y)\,dx\,dy = \iint \frac{f(x, y)}{|\phi_x|}\,d\xi\,d\eta.$$

Keeping $\xi$ constant and integrating the right-hand side with respect
to $\eta$, the integral with respect to $\eta$ can be written in the form

$$\int \frac{f(x, y)}{\sqrt{\phi_x^2 + \phi_y^2}}\;\frac{\sqrt{\phi_x^2 + \phi_y^2}}{|\phi_x|}\,d\eta.$$

Since on $C_\xi$

$$\frac{ds}{d\eta} = \sqrt{1 + \left(\frac{dx}{d\eta}\right)^2} = \frac{\sqrt{\phi_x^2 + \phi_y^2}}{|\phi_x|},$$

this integral may be regarded as an integral along the curve $\phi(x, y) = \xi$,
the length of arc s being the variable of integration. Thus, we
obtain the resolution

(37a) $$\iint_R f(x, y)\,dx\,dy = \int d\xi \int_{C_\xi} \frac{f(x, y)}{\sqrt{\phi_x^2 + \phi_y^2}}\,ds$$

for our double integral.

The intuitive meaning of this resolution is very easily recognized 
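if we suppose that corresponding to the curves $C_\xi$ there is a family of
orthogonal curves (the so-called orthogonal trajectories) that intersect
each separate curve $\phi = \text{constant} = \xi$ at right angles, in the direction
of the vector $\operatorname{grad}\phi$. If $\sigma$ is the length of arc on an orthogonal curve
represented by the functions $x(\sigma)$ and $y(\sigma)$, then

$$\frac{dx}{d\sigma} = \frac{\phi_x}{\sqrt{\phi_x^2 + \phi_y^2}},\qquad \frac{dy}{d\sigma} = \frac{\phi_y}{\sqrt{\phi_x^2 + \phi_y^2}}.$$

Since

$$\frac{d\xi}{d\sigma} = \phi_x\frac{dx}{d\sigma} + \phi_y\frac{dy}{d\sigma},$$

we obtain

(37b) $$\frac{d\xi}{d\sigma} = \sqrt{\phi_x^2 + \phi_y^2} = |\operatorname{grad}\phi|.$$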


Figure 4.17

We now consider the mesh bounded by two curves $\phi(x, y) = \xi$,
$\phi(x, y) = \xi + \Delta\xi$, and two orthogonal curves that cut off a portion of
length $\Delta s$ from $\phi(x, y) = \xi$ (Fig. 4.17). The area of this mesh is given
approximately by the product $\Delta s\,\Delta\sigma$, and this in turn is approximately
equal to

$$\frac{\Delta s\,\Delta\xi}{\sqrt{\phi_x^2 + \phi_y^2}}.$$

This leads to a new interpretation of the identity (37a):

Instead of calculating a double integral by subdividing the region
into “infinitesimal rectangles” with sides parallel to the coordinate
axes, we may use the subdivision into infinitesimal curvilinear rectangles
determined by the curves $\phi(x, y) = \text{constant}$ and their orthogonal
trajectories.
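
The resolution (37a) can be tried out on a simple case. The Python sketch below is an illustration under assumptions chosen here, not in the text: the family $\phi(x, y) = x^2 + y^2$, the integrand $f(x, y) = x^2$, the unit disk as region, and the midpoint rule for all quadratures. The level curve $\phi = \xi$ is then the circle of radius $\sqrt{\xi}$, and both sides should be close to $\pi/4$.

```python
import math

def f(x, y):
    """Assumed sample integrand."""
    return x * x

def direct_integral(n=400):
    """Midpoint rule for the double integral of f over the unit disk (Cartesian grid)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            if x * x + y * y <= 1.0:
                total += f(x, y)
    return total * h * h

def resolved_integral(n_xi=400, n_theta=400):
    """Right-hand side of (37a) for phi = x^2 + y^2 with xi running over 0..1."""
    h_xi = 1.0 / n_xi
    h_th = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_xi):
        xi = (i + 0.5) * h_xi
        rho = math.sqrt(xi)                        # radius of the level curve phi = xi
        inner = 0.0
        for j in range(n_theta):
            th = (j + 0.5) * h_th
            x, y = rho * math.cos(th), rho * math.sin(th)
            grad = 2.0 * math.sqrt(x * x + y * y)  # |grad phi| = sqrt(phi_x^2 + phi_y^2)
            inner += f(x, y) / grad * rho * h_th   # arc element ds = rho d(theta)
        total += inner * h_xi
    return total

if __name__ == "__main__":
    print(direct_integral(), resolved_integral(), math.pi / 4.0)
```

The Cartesian sum is only roughly accurate because of the jagged boundary cells, whereas the resolved form follows the level curves exactly.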

A similar resolution can be effected in three-dimensional space. If
the region R is covered by a family of surfaces $S_\xi$ given by an equation
$\phi(x, y, z) = \text{constant} = \xi$ in such a way that through every point
there passes one, and only one, surface, then we can take the quantity
$\xi = \phi(x, y, z)$ as a variable of integration. In this way we resolve a
triple integral

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \int d\xi \iint \frac{f(x, y, z)}{|\phi_x|}\,dy\,dz
= \int d\xi \iint \frac{f(x, y, z)}{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}\;\frac{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}{|\phi_x|}\,dy\,dz$$

into an integral

$$\iint_{S_\xi} \frac{f(x, y, z)}{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}\,dS$$

over the surface $\phi = \xi$ with element of area

$$dS = \frac{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}{|\phi_x|}\,dy\,dz$$

[see (29d), p. 426] and a subsequent integration with respect to $\xi$:

(37c) $$\iiint_R f(x, y, z)\,dx\,dy\,dz = \int d\xi \iint_{S_\xi} \frac{f(x, y, z)}{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}\,dS.$$

This formula again permits a geometric interpretation if we introduce
the two-parametric family of curves orthogonal at each point
to a surface $\xi = \text{constant}$ and use, in addition to the $S_\xi$, coordinate
surfaces consisting of those curves.


b. Application to Areas Swept Out by Moving Curves and Volumes 
Swept Out by Moving Surfaces. Guldin’s Formula. The Polar 
Planimeter 


The quantity

$$\frac{d\sigma}{d\xi} = \frac{1}{\sqrt{\phi_x^2 + \phi_y^2}}$$

appearing in formulae (37a, b) can be interpreted kinematically if we
identify the parameter $\xi$ with the time t. The equation $\phi(x, y) =
\text{constant} = t$ represents then the position $C_t$ of a moving curve at the
time t. The quantity $\Delta\sigma$, which measures distances along the curves
orthogonal to the curves $C_t$, can be thought of as the normal distance
between the curves $C_t$ and $C_{t+\Delta t}$. Accordingly,

(38a) $$c = \frac{d\sigma}{dt} = \frac{1}{\sqrt{\phi_x^2 + \phi_y^2}}$$

is the normal velocity of the moving curve $C_t$ at the time t. This velocity
is different at different points of $C_t$. Similarly, the normal velocity
of the moving surface $S_t$ in space with equation $\phi(x, y, z) = \text{constant}
= t$ is

(38b) $$c = \frac{1}{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}.$$

In physics, such moving surfaces occur as wave fronts (e.g., for
electromagnetic waves propagating in a medium).

The normal velocity c of a moving surface $S_t$ (and similarly of a
moving curve $C_t$ in the plane) has a particularly simple meaning if
$S_t$ consists of individual moving particles. If the position of one of
these particles is described by the three functions x = x(t), y = y(t),
z = z(t) and if the particle at all times stays on the moving surface,
the equation

$$\phi(x(t), y(t), z(t)) = t$$

must hold for all t. Differentiating with respect to t we find the
equation

$$1 = \phi_x\frac{dx}{dt} + \phi_y\frac{dy}{dt} + \phi_z\frac{dz}{dt}.$$

If we divide this equation by the absolute gradient of $\phi$ we obtain
the relation

(38c) $$c = \pm\left(\xi\frac{dx}{dt} + \eta\frac{dy}{dt} + \zeta\frac{dz}{dt}\right),$$

where c is the normal velocity defined by (38b), $\xi$, $\eta$, $\zeta$ are the direction
cosines of one of the normals of $S_t$, and the positive or negative sign
applies according to the normals pointing in the direction of increasing
or decreasing t, respectively. If we introduce the unit-normal
vector

$$\mathbf{n} = (\xi, \eta, \zeta)$$

and the velocity vector of the particle

$$\mathbf{v} = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right)$$

we can represent c by the scalar product

(38d) $$c = \pm\,\mathbf{v}\cdot\mathbf{n}.$$

In words, the component normal to the surface $S_t$ of the velocity of a
particle moving with the surface equals $\pm c$ where c is the normal
velocity of $S_t$. The positive sign holds when $\mathbf{n}$ is the “forward” normal
of $S_t$, that is, the normal on the side of the surface facing the points to
be swept over in the immediate future.

Formula (37c) for f = 1 yields an expression for the volume V of the
region swept over by a moving surface $S_t$ with normal velocity c:

(39a) $$V = \iiint dx\,dy\,dz = \int dt \iint_{S_t} c\,dS.$$

Similarly, we find for the area A of a region in the plane swept over by
a moving curve $C_t$ the expression

(39b) $$A = \int dt \int_{C_t} c\,ds.$$

We apply these results to the case of an area swept over by a
straight line segment $C_t$ moving in the plane (Fig. 4.18). The segment
can be represented by an equation of the form

(40a) $$\xi(t)\,x + \eta(t)\,y = p(t),$$

where $(\xi, \eta)$ is the unit normal and p the (signed) distance of $C_t$ from
the origin. The center of $C_t$ (which is the same as its centroid) is at the
point [see (32e), p. 432]

(40b) $$X(t) = \frac{\int_{C_t} x\,ds}{\int_{C_t} ds},\qquad Y(t) = \frac{\int_{C_t} y\,ds}{\int_{C_t} ds}.$$

Figure 4.18

Integration of (40a) with respect to s over the segment $C_t$ furnishes the
relation

(40c) $$\xi(t)\,X(t) + \eta(t)\,Y(t) = p(t),$$

which merely states that the center of $C_t$ lies on $C_t$. If $C_t$ is thought to
consist of individual moving particles the normal component of the
velocity of these particles is found from (40a), (38c) to be¹

$$\mathbf{n}\cdot\mathbf{v} = \xi\frac{dx}{dt} + \eta\frac{dy}{dt} = \frac{dp}{dt} - x\frac{d\xi}{dt} - y\frac{d\eta}{dt}.$$

Hence by (40b), (40c)

$$\pm\int_{C_t} c\,ds = \int_{C_t} \mathbf{n}\cdot\mathbf{v}\,ds = \left(\frac{dp}{dt} - X\frac{d\xi}{dt} - Y\frac{d\eta}{dt}\right)\int_{C_t} ds
= \left(\xi\frac{dX}{dt} + \eta\frac{dY}{dt}\right)L = \mathbf{w}\cdot\mathbf{n}\,L,$$

where

$$\mathbf{w} = \left(\frac{dX}{dt}, \frac{dY}{dt}\right)$$

is the velocity vector of the center (X, Y) of the segment $C_t$, and

$$L = L(t) = \int_{C_t} ds,$$

the length of $C_t$. It follows from (39b) that the area swept over by the
moving segment $C_t$ is

(41a) $$A = \int \pm L\,\mathbf{w}\cdot\mathbf{n}\,dt.$$

¹ The same formula can also be derived using the expression (38a) for c if one
calculates the first derivatives of the function $t = \phi(x, y)$ with respect to x and y
from the implicit equation (40a) for the function t.


In the same way, one finds that the volume swept out by a moving
plane region $S_t$ of area A(t) and unit normal $\mathbf{n}$ is

(41b) $$V = \int \pm A\,\mathbf{w}\cdot\mathbf{n}\,dt,$$

where $\mathbf{w}$ is the velocity of the centroid (X, Y, Z) of $S_t$. In these formulas
the positive sign is taken when $\mathbf{n}$ is the “forward normal” of $S_t$, the
one that points in the direction of motion.
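
As an illustration of (41a), consider a segment of constant length L whose center moves once around a circle of radius R while the segment always points along the radius. The region swept out is an annulus of area $2\pi R L$. The Python sketch below evaluates $\int L\,\mathbf{w}\cdot\mathbf{n}\,dt$ numerically; the function names, the central-difference velocity, and the values R = 3, L = 1 are assumptions made for this example.

```python
import math

def swept_area(center, normal, length, t0, t1, n=2000):
    """Midpoint-rule value of int L(t) * (w . n) dt, w being the velocity of the center."""
    h = (t1 - t0) / n
    eps = 1e-6
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        x1, y1 = center(t + eps)
        x0, y0 = center(t - eps)
        wx = (x1 - x0) / (2.0 * eps)   # central difference for dX/dt
        wy = (y1 - y0) / (2.0 * eps)   # central difference for dY/dt
        nx, ny = normal(t)
        total += length(t) * (wx * nx + wy * ny) * h
    return total

def radial_segment_area(R=3.0, L=1.0):
    """Segment along the radius, center on the circle of radius R, one full turn."""
    return swept_area(
        lambda t: (R * math.cos(t), R * math.sin(t)),
        lambda t: (-math.sin(t), math.cos(t)),   # forward unit normal of the segment
        lambda t: L,
        0.0,
        2.0 * math.pi,
    )

if __name__ == "__main__":
    print(radial_segment_area(), 2.0 * math.pi * 3.0 * 1.0)
```

Reversing the direction of the normal changes the sign of the result, which is the signed-area behaviour described by the $\pm$ in (41a).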

Of special interest is the case of formula (41b) in which the centroid
(X, Y, Z) of $S_t$ moves along a curve which at every moment is
perpendicular to the plane of $S_t$. In that case, the normal component
of velocity of the centroid coincides with the speed of motion of the
centroid along its path:

$$\pm\,\mathbf{w}\cdot\mathbf{n} = \frac{d\sigma}{dt},$$

where $\sigma$ is the length of arc along the path of the centroid. It follows
then that

(42a) $$V = \int A\,\frac{d\sigma}{dt}\,dt = \int A\,d\sigma.$$

If, moreover, all the plane regions $S_t$ have the same area A, we find
that

(42b) $$V = A\int d\sigma,$$

or that the volume swept out by the $S_t$ is equal to their area A multiplied
by the length of the path described by their centroids. A particular case is
obviously Guldin’s rule for the volume of a solid of revolution swept out
by rotation of a plane region R about an axis in that plane. The volume
is equal to the area A of R multiplied by the length of the path described
by the centroid of R during the revolution (see Volume I, p. 374).
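
Guldin’s rule can be checked on a torus, obtained by rotating a disk of radius r whose center lies at distance R > r from the axis. The following Python sketch compares the rule with a direct washer integration; the function names and the sample values are assumptions of this example.

```python
import math

def torus_volume_washers(R, r, n=20000):
    """Integrate the area of horizontal washers: pi*((R+s)^2 - (R-s)^2), s = sqrt(r^2 - z^2)."""
    h = 2.0 * r / n
    total = 0.0
    for k in range(n):
        z = -r + (k + 0.5) * h
        s = math.sqrt(r * r - z * z)
        total += math.pi * ((R + s) ** 2 - (R - s) ** 2) * h
    return total

def torus_volume_guldin(R, r):
    """Area of the disk times the length of the path of its centroid."""
    return (math.pi * r * r) * (2.0 * math.pi * R)

if __name__ == "__main__":
    print(torus_volume_washers(3.0, 1.0), torus_volume_guldin(3.0, 1.0))
```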
Returning to formula (41a) we see that the integral 


(43a) $$\int L\,\mathbf{w}\cdot\mathbf{n}\,dt$$

represents the signed area swept out by the segments $C_t$, the sign depending
on whether the normal $\mathbf{n}$ points in the direction of motion or
in the opposite one. The same holds for an integral

(43b) $$\int A\,\mathbf{w}\cdot\mathbf{n}\,dt$$


associated with volumes swept out by a moving plane area. 

These observations allow us to extend our results to cases in which 
the segment or plane area does not always move in the same sense or 
covers part of the plane (or space) more than once. The integrals given 
above will then express the algebraic sum of the areas (or volumes) 
of the parts of the region described, each taken with the appropriate 
sign. 

As an example, let a segment of constant length move so as to have
its end points always on two fixed curves $\Gamma$ and $\Gamma'$ in a plane, as in Fig.
4.19. From the arrows showing the positive direction of the normal,
we can determine the sign with which each area appears in the integral,
and we find that the integral gives the difference between the
areas enclosed by $\Gamma$ and $\Gamma'$. If $\Gamma'$ contains zero area, as when it
degenerates into a single segment of a curve multiply described, the
integral gives the area enclosed by $\Gamma$.

Figure 4.19

This principle is used in the construction of the well-known polar
planimeter (Amsler’s planimeter). This is a mechanical apparatus for
measuring plane areas. It consists of a rigid rod at the center of which
is a measuring wheel that can roll on the drawing-paper. The plane of
the wheel is perpendicular to the rod. When the instrument is to be
used to measure the area enclosed by a curve $\Gamma$ drawn on the paper,
one end of the rod is moved round the curve, while the other is hinged
to a rigid arm whose other end pivots about a fixed point O, the pole,
exterior to $\Gamma$. The hinged end of the rod therefore describes (multiply)
an arc of a circle, that is, a closed curve containing zero area. It
follows that here the expression (43a) furnishes the area enclosed by
$\Gamma$. But the integrand $L\,\mathbf{w}\cdot\mathbf{n}$ is proportional to the angular speed with
which the measuring wheel turns, provided that the circumference of
the wheel moves on the paper as the rod moves, in which case the
position of the wheel is only affected by the motion normal to the rod.
The total angle by which the wheel has turned is then proportional
to the area enclosed by $\Gamma$.

In the instrument as usually constructed the wheel is not exactly 
at the center of the rod, but this only alters the factor of proportion- 
ality in the result, and the factor can be determined directly by a 
calibration of the instrument. 


4.11 Volumes and Surface Areas in Any Number of Dimensions 


a. Surface Areas and Surface Integrals in More than Three 
Dimensions 


In n-dimensional space described by n coordinates $x_1, \ldots, x_n$ an
(n − 1)-dimensional surface (hypersurface or manifold) is defined by
an implicit equation

(44a) $$\phi(x_1, x_2, \ldots, x_n) = \text{constant},$$

where at each point of the surface at least one of the first derivatives
of $\phi$ does not vanish. We suppose that a portion S of this surface
corresponds to a certain region B in $x_1 x_2 \cdots x_{n-1}$-space where
$\partial\phi/\partial x_n \neq 0$ and $x_n$ can be calculated from equation (44a) as a
function of the other coordinates.

We now define the (n − 1)-measure of this portion of surface as the
integral

(44b) $$A = \int\!\cdots\!\int_B \frac{\sqrt{\phi_{x_1}^2 + \phi_{x_2}^2 + \cdots + \phi_{x_n}^2}}{|\phi_{x_n}|}\,dx_1\,dx_2\cdots dx_{n-1}.$$


This definition is a formal generalization of formula (29b), p. 425 for 
areas of surfaces in three-space and can be based on similar intuitive 
arguments. When there is no danger of confusion, we shall also refer to 
A simply as “area” even in the case of a hypersurface in n-dimensional 
space. A more systematic discussion of surfaces, surface areas, and 
surface integrals will be given in the next chapter. For the moment, 
we observe only that the quantity A defined by (44b) is independent 
of the choice of the coordinate $x_n$ for which we solve equation (44a).
This may be proved in the same way as was done in the three-dimen- 
sional case on p. 426. 


More generally, we define the integral of a function $f(x_1, \ldots, x_n)$
over this (n − 1)-dimensional surface as

(44c) $$\int\!\cdots\!\int_S f(x_1, \ldots, x_n)\,d\sigma
= \int\!\cdots\!\int_B f(x_1, \ldots, x_n)\,\frac{\sqrt{\phi_{x_1}^2 + \cdots + \phi_{x_n}^2}}{|\phi_{x_n}|}\,dx_1\,dx_2\cdots dx_{n-1},$$

where, as before, we suppose that $x_n$ is expressed in terms of $x_1, \ldots,
x_{n-1}$ by means of equation (44a). We again find that the value of the
expression (44c) is independent of the choice of the variable $x_n$.

As for two or three dimensions, a multiple volume integral over an
n-dimensional region R

(45a) $$\int\!\cdots\!\int_R f(x_1, \ldots, x_n)\,dx_1\cdots dx_n$$

can be resolved into surface integrals [see formulas (37a, c)]. We
assume that the region R is covered by a family of hypersurfaces $S_\xi$

(45b) $$\phi(x_1, \ldots, x_n) = \text{constant} = \xi$$

in such a way that through each point of R there passes one, and only
one, surface. If we replace $x_1, \ldots, x_{n-1}, x_n$ by new independent
variables

$$x_1, \ldots, x_{n-1},\quad \xi = \phi(x_1, \ldots, x_n),$$

the multiple integral (45a) becomes by the rule for transformation of
integrals (p. 404)

$$\int d\xi \int\!\cdots\!\int \frac{f}{|\phi_{x_n}|}\,dx_1\cdots dx_{n-1}.$$

Using formula (44c), we obtain the formula

(45c) $$\int\!\cdots\!\int_R f(x_1, \ldots, x_n)\,dx_1\cdots dx_n
= \int d\xi \int\!\cdots\!\int_{S_\xi} \frac{f}{\sqrt{\phi_{x_1}^2 + \cdots + \phi_{x_n}^2}}\,d\sigma,$$

where

(45d) $$d\sigma = \frac{\sqrt{\phi_{x_1}^2 + \cdots + \phi_{x_n}^2}}{|\phi_{x_n}|}\,dx_1\cdots dx_{n-1}$$

is the element of area of the surface $S_\xi$.


b. Area and Volume of the n-Dimensional Sphere 


As an application of the formula (45c) for reduction of volume to
surface integrals, we shall calculate the area and volume of a sphere
of radius R in n-dimensional space, that is, the area of the hypersurface
with equation

(46a) $$x_1^2 + \cdots + x_n^2 = R^2,$$

and the volume of the ball

(46b) $$x_1^2 + \cdots + x_n^2 \le R^2.$$

We first derive a general formula that reduces the space integral
of a function with spherical symmetry to a single integral. We say the
function f of the variables $x_1, \ldots, x_n$ has spherical symmetry if

$$f = f(r),$$

where

(46c) $$r = \sqrt{x_1^2 + \cdots + x_n^2},$$

that is, if f is constant on spheres with centers at the origin. The
sphere $S_r$ of radius r about the origin is given by the equation

(46d) $$\phi(x_1, \ldots, x_n) = \sqrt{x_1^2 + \cdots + x_n^2} = \text{constant} = r.$$

Here

(46e) $$\phi_{x_i} = \frac{x_i}{r},\qquad \sqrt{\phi_{x_1}^2 + \cdots + \phi_{x_n}^2} = 1.$$


From (45c) we then obtain the volume integral of the function f(r)
over the ball (46b), namely,

(46f) $$\int\!\cdots\!\int f\,dx_1\cdots dx_n = \int_0^R f(r)\,dr \int\!\cdots\!\int_{S_r} d\sigma
= \int_0^R f(r)\,\Omega_n(r)\,dr,$$

where $\Omega_n(r)$ is the area of the sphere $S_r$. Here, by (44b), (46e) the area
of the hemisphere

$$x_1^2 + \cdots + x_n^2 = r^2 \qquad (x_n \ge 0)$$

is

(47a) $$\tfrac{1}{2}\,\Omega_n(r) = \int\!\cdots\!\int_{B_r} \frac{r}{x_n}\,dx_1\cdots dx_{n-1},$$

where the integration is extended over the (n − 1)-dimensional ball
$B_r$ given by

$$x_1^2 + \cdots + x_{n-1}^2 \le r^2,$$

and where

$$x_n = \sqrt{r^2 - x_1^2 - \cdots - x_{n-1}^2}.$$

Replacing $x_1, \ldots, x_{n-1}$ in $B_r$ by the new variables

$$\xi_i = \frac{1}{r}\,x_i \qquad (i = 1, \ldots, n-1)$$

and putting

$$\xi_n = \frac{1}{r}\,x_n = \sqrt{1 - \xi_1^2 - \cdots - \xi_{n-1}^2},$$

we obtain from (47a) that

(47b) $$\Omega_n(r) = 2r^{n-1} \int\!\cdots\!\int \frac{d\xi_1\cdots d\xi_{n-1}}{\xi_n},$$

where the integration is over the unit ball in n − 1 dimensions

$$\xi_1^2 + \cdots + \xi_{n-1}^2 \le 1.$$

Formula (47b) can be written as

(47c) $$\Omega_n(r) = \omega_n r^{n-1},$$

where

$$\omega_n = 2\int\!\cdots\!\int \frac{d\xi_1\cdots d\xi_{n-1}}{\xi_n} = \Omega_n(1)$$

is the area of the unit sphere $S_1$ in n dimensions. It expresses the
intuitively plausible fact that areas of spheres in n dimensions are
proportional to the (n − 1)-st power of their radius. Formula (46f) for
the space integral over the ball (46b) of a function with spherical
symmetry now takes the form

(48a) $$\int\!\cdots\!\int f(r)\,dx_1\cdots dx_n = \omega_n \int_0^R f(r)\,r^{n-1}\,dr.$$


We can calculate $\omega_n$ conveniently from this formula. We choose for
f(r) a function for which the integral on the right converges absolutely
for $R \to \infty$ and can be evaluated explicitly. The improper integral of
f(r) as a function of $x_1, \ldots, x_n$ over the whole space then also converges.
We choose for f the function¹

$$f(r) = \exp(-r^2) = \exp(-x_1^2 - \cdots - x_n^2).$$

The integral of f over the whole space is the limit of integrals over cubes
$C_a$ with center at the origin and sides of length 2a parallel to the axes.
Here

$$\int\!\cdots\!\int_{C_a} f(r)\,dx_1\cdots dx_n
= \int_{-a}^{a} dx_1 \int_{-a}^{a} dx_2 \cdots \int_{-a}^{a} dx_n\,\exp(-x_1^2)\exp(-x_2^2)\cdots\exp(-x_n^2)
= \left(\int_{-a}^{a} e^{-x^2}\,dx\right)^{n}.$$

¹ One conveniently writes exp(z) for the exponential function $e^z$ in cases where the
exponent z is a more complicated expression.

Thus, for $a \to \infty$, we obtain from (48a) the identity

(48b) $$\left(\int_{-\infty}^{\infty} e^{-x^2}\,dx\right)^{n} = \omega_n \int_0^{\infty} e^{-r^2} r^{n-1}\,dr.$$

For the special case n = 2, this formula already has been derived by
a similar argument on p. 415 and led to the result [see (25a)] that

(48c) $$\Gamma\!\left(\tfrac{1}{2}\right) = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$

On the other hand, the substitution $r^2 = s$ shows that

(48d) $$\int_0^{\infty} e^{-r^2} r^{n-1}\,dr = \frac{1}{2}\int_0^{\infty} e^{-s} s^{(n-2)/2}\,ds = \frac{1}{2}\,\Gamma\!\left(\frac{n}{2}\right).$$

Here $\Gamma(\mu)$ denotes the gamma function² defined by

$$\Gamma(\mu) = \int_0^{\infty} e^{-s} s^{\mu-1}\,ds \qquad (\mu > 0)$$

in Volume I (p. 308). Hence, (48b) leads to the value

(48e) $$\omega_n = \frac{2\,(\sqrt{\pi})^{n}}{\Gamma\!\left(\frac{n}{2}\right)}$$

for the surface area of the unit sphere in n dimensions. The value of
$\Gamma(n/2)$ for integers n is easily determined from the recursion formula

(48f) $$\Gamma(\mu) = (\mu - 1)\,\Gamma(\mu - 1),$$

which follows directly by integration by parts from the definition
of the gamma function (see Volume I, p. 308). Hence, for even n

(48g) $$\Gamma\!\left(\frac{n}{2}\right) = \frac{n-2}{2}\cdot\frac{n-4}{2}\cdots\frac{2}{2} = \left(\frac{n}{2} - 1\right)!,$$

while for odd n, using (48c),

(48h) $$\Gamma\!\left(\frac{n}{2}\right) = \frac{n-2}{2}\cdot\frac{n-4}{2}\cdots\frac{1}{2}\,\Gamma\!\left(\frac{1}{2}\right)
= \frac{(n-2)(n-4)\cdots 3\cdot 1}{2^{(n-1)/2}}\,\sqrt{\pi}.$$

In this way we obtain from (48e) successively the values

$$\omega_2 = 2\pi,\qquad \omega_3 = 4\pi,\qquad \omega_4 = 2\pi^2,\qquad \omega_5 = \tfrac{8}{3}\pi^2.$$

² See also pp. 497 of the present volume.


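In order to find the volume of the n-dimensional ball $V_n(R)$ of radius
R, we put f = 1 in formula (48a) and find that

(49a) $$V_n(R) = \int\!\cdots\!\int dx_1\cdots dx_n = \omega_n \int_0^R r^{n-1}\,dr = v_n R^n,$$

where

(49b) $$v_n = \frac{1}{n}\,\omega_n = \frac{(\sqrt{\pi})^{n}}{\Gamma\!\left(\frac{n+2}{2}\right)}$$

is the volume of the n-dimensional unit ball. Thus,

(49c) $$v_1 = 2,\qquad v_2 = \pi,\qquad v_3 = \tfrac{4}{3}\pi,\qquad v_4 = \tfrac{1}{2}\pi^2,\qquad v_5 = \tfrac{8}{15}\pi^2.$$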
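
Formulas (48e) and (49b) are easy to evaluate with a library gamma function. The Python sketch below uses `math.gamma` from the standard library; the function names are chosen here for illustration only.

```python
import math

def unit_sphere_area(n):
    """omega_n = 2 * (sqrt(pi))^n / Gamma(n/2), formula (48e)."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def unit_ball_volume(n):
    """v_n = omega_n / n, formula (49b)."""
    return unit_sphere_area(n) / n

if __name__ == "__main__":
    for n in range(1, 8):
        print(n, unit_sphere_area(n), unit_ball_volume(n))
```

Running the loop for larger n shows that $v_n$ eventually decreases toward zero, since $\Gamma((n+2)/2)$ grows faster than $(\sqrt{\pi})^n$.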


c. Generalizations. Parametric Representations 


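In n-dimensional space we can consider an r-dimensional set for
any $r \le n$ and seek to define its area. For this purpose a parametric
representation is advantageous. Let the r-dimensional set be given
by the equations

$$x_1 = \phi_1(u_1, \ldots, u_r),\quad \ldots,\quad x_n = \phi_n(u_1, \ldots, u_r),$$

where the functions $\phi_\nu$ possess continuous derivatives in a region B
of the variables $(u_1, \ldots, u_r)$. As the variables $u_1, \ldots, u_r$ range over
this region, the point $(x_1, \ldots, x_n)$ describes an r-dimensional surface.
From the rectangular matrix (see p. 147),

$$\begin{pmatrix}
\dfrac{\partial x_1}{\partial u_1} & \dfrac{\partial x_2}{\partial u_1} & \cdots & \dfrac{\partial x_n}{\partial u_1} \\
\dfrac{\partial x_1}{\partial u_2} & \dfrac{\partial x_2}{\partial u_2} & \cdots & \dfrac{\partial x_n}{\partial u_2} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial x_1}{\partial u_r} & \dfrac{\partial x_2}{\partial u_r} & \cdots & \dfrac{\partial x_n}{\partial u_r}
\end{pmatrix}$$

we now form all possible r-rowed determinants $D_i$, where $i = 1, 2, \ldots, k$
with $k = \binom{n}{r}$, the first of which, for example, is the determinant

$$D_1 = \begin{vmatrix}
\dfrac{\partial x_1}{\partial u_1} & \dfrac{\partial x_2}{\partial u_1} & \cdots & \dfrac{\partial x_r}{\partial u_1} \\
\dfrac{\partial x_1}{\partial u_2} & \dfrac{\partial x_2}{\partial u_2} & \cdots & \dfrac{\partial x_r}{\partial u_2} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial x_1}{\partial u_r} & \dfrac{\partial x_2}{\partial u_r} & \cdots & \dfrac{\partial x_r}{\partial u_r}
\end{vmatrix}.$$

The area of the r-dimensional surface is then given by the integral

(50a) $$\int\!\cdots\!\int \sqrt{D_1^2 + D_2^2 + \cdots + D_k^2}\;du_1\cdots du_r,\qquad k = \binom{n}{r}.$$

By means of the theorem on the transformation of multiple integrals
(p. 404) and simple calculations with determinants (which we
shall omit here), we can prove that the area defined by this expression
is not changed if we replace $u_1, \ldots, u_r$ by other parameters. We see
also that for r = 1 this reduces to the usual formula for the length
of arc, and for r = 2 in a space of three dimensions it becomes formula
(30a), p. 428 for the area.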

We prove formula (50a) when r = n − 1, where n is arbitrary; that
is, we shall prove the following theorem:

If a portion of an (n − 1)-dimensional hypersurface in n-dimensional
space can be represented parametrically by the equations

$$x_i = \psi_i(u_1, \ldots, u_{n-1}) \qquad (i = 1, \ldots, n),$$

then its area is given by

(50b) $$A = \int\!\cdots\!\int \sqrt{D_1^2 + \cdots + D_n^2}\;du_1\cdots du_{n-1},$$

where $D_i$ is the Jacobian of (n − 1) rows given by

$$D_i = \frac{d(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)}{d(u_1, \ldots, u_{n-1})}
= 1 \Big/ \frac{d(u_1, \ldots, u_{n-1})}{d(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)}.$$

Here, as always, we assume the existence and continuity of all the
derivatives involved.

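Without loss of generality we may assume that $\phi_{x_n} \neq 0$. Then, by
(44b), A is given by

$$A = \int\!\cdots\!\int \frac{\sqrt{\phi_{x_1}^2 + \cdots + \phi_{x_n}^2}}{|\phi_{x_n}|}\,dx_1\cdots dx_{n-1}.$$

We have only to show that

$$\frac{1}{|\phi_{x_n}|}\,|\operatorname{grad}\phi|\,dx_1\cdots dx_{n-1} = \sqrt{\sum_i D_i^2}\;du_1\cdots du_{n-1},$$

or

$$|\operatorname{grad}\phi|^2 = \phi_{x_n}^2 \left(\sum_i D_i^2\right)\left(\frac{d(u_1, \ldots, u_{n-1})}{d(x_1, \ldots, x_{n-1})}\right)^{2} = \frac{\phi_{x_n}^2}{D_n^2}\sum_i D_i^2.$$

Now, from the properties of Jacobians,

$$\frac{D_i}{D_n} = \frac{d(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)/d(u_1, \ldots, u_{n-1})}{d(x_1, \ldots, x_{n-1})/d(u_1, \ldots, u_{n-1})}
= \frac{d(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)}{d(x_1, \ldots, x_{n-1})}.$$

This last Jacobian corresponds to the introduction of $(x_1, \ldots, x_{i-1},
x_{i+1}, \ldots, x_n)$ instead of $(x_1, \ldots, x_{n-1})$ as independent variables. But as
the partial derivatives $\partial x_n/\partial x_i$ are obtained from the equations

$$\phi_{x_n}\frac{\partial x_n}{\partial x_i} + \phi_{x_i} = 0 \qquad (i = 1, \ldots, n-1),$$

we have $D_i/D_n = \pm\,\phi_{x_i}/\phi_{x_n}$. Hence,

$$\sum_i \frac{D_i^2}{D_n^2} = \sum_i \frac{\phi_{x_i}^2}{\phi_{x_n}^2},$$

which proves the formula (50b) for A.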
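It may be mentioned here that the expression $\sum_i D_i^2$ may be represented
as a determinant of (n − 1) rows,

(50c) $$W = \sum_i D_i^2 = \Gamma(X_{u_1}, \ldots, X_{u_{n-1}}) =
\begin{vmatrix}
X_{u_1}\cdot X_{u_1} & X_{u_1}\cdot X_{u_2} & \cdots & X_{u_1}\cdot X_{u_{n-1}} \\
X_{u_2}\cdot X_{u_1} & X_{u_2}\cdot X_{u_2} & \cdots & X_{u_2}\cdot X_{u_{n-1}} \\
\vdots & \vdots & & \vdots \\
X_{u_{n-1}}\cdot X_{u_1} & X_{u_{n-1}}\cdot X_{u_2} & \cdots & X_{u_{n-1}}\cdot X_{u_{n-1}}
\end{vmatrix}$$

(“Gram determinant”; see p. 194), so that

(50d) $$A = \int\!\cdots\!\int \sqrt{W}\;du_1\cdots du_{n-1}.$$

Here, the elements of the determinant are the inner products of the
vectors

$$X_{u_i} = \left(\frac{\partial x_1}{\partial u_i}, \ldots, \frac{\partial x_n}{\partial u_i}\right)\quad\text{and}\quad
X_{u_k} = \left(\frac{\partial x_1}{\partial u_k}, \ldots, \frac{\partial x_n}{\partial u_k}\right),$$

namely, the expressions

(50e) $$X_{u_i}\cdot X_{u_k} = \sum_{\nu=1}^{n} \frac{\partial x_\nu}{\partial u_i}\,\frac{\partial x_\nu}{\partial u_k}.$$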

Exercises 4.11 
1. Calculate the volume of the n-dimensional ellipsoid
$$\frac{x_1^2}{a_1^2} + \cdots + \frac{x_n^2}{a_n^2} \le 1.$$

2. Express the integral I of a function of $x_1$, depending on $x_1$ alone, over the
unit sphere $x_1^2 + \cdots + x_n^2 = 1$ in n-dimensional space, as a single
integral.

3. An n-simplex is the intersection in n-dimensional space of n + 1 half-
spaces in general position; that is, any n of the bounding hyperplanes
of the half-spaces meet in exactly one point, a vertex of the simplex: for
example, a triangle in the plane or a tetrahedron in three-dimensional
space. Find the volume of the n-simplex bounded by the hyperplanes
$x_k \ge 0$ for $k = 1, 2, \ldots, n$ and
$$\frac{x_1}{a_1} + \frac{x_2}{a_2} + \cdots + \frac{x_n}{a_n} \le 1.$$

4.12 Improper Single Integrals as Functions of a Parameter 


a. Uniform Convergence. Continuous Dependence on the Parameter 


Improper integrals frequently appear as functions of a parameter. 
For example, the integral of the general power

(51a) $$\int_0^1 y^x\,dy = \frac{1}{1 + x}$$

is an improper integral for x in the interval $-1 < x < 0$.

We have seen (p. 74) that an integral over a finite interval is 
continuous when regarded as a function of a parameter, provided that 
the integrand is continuous. In the case of an infinite interval, 
however, the situation is not so simple. Let us consider, for example, 
the integral 


(51b) $$F(x) = \int_0^{\infty} \frac{\sin xy}{y}\,dy.$$


According to whether x > 0 or x < 0, this is transformed by the sub- 
stitution xy = z into 


$$\int_0^{\infty} \frac{\sin z}{z}\,dz \quad\text{or}\quad \int_0^{-\infty} \frac{\sin z}{z}\,dz = -\int_0^{\infty} \frac{\sin z}{z}\,dz.$$

The integral

$$\int_0^{\infty} \frac{\sin z}{z}\,dz$$

converges, as we have seen in Volume I (p. 310), and in fact has the
value $\pi/2$ (Volume I, p. 589). Thus, although the function (sin xy)/y,
regarded as a function of x and y, is continuous everywhere and its
integral converges for every value of x, the function F(x) is
discontinuous:

(51b) $$\int_0^{\infty} \frac{\sin xy}{y}\,dy = \begin{cases} \pi/2 & \text{for } x > 0, \\ 0 & \text{for } x = 0, \\ -\pi/2 & \text{for } x < 0. \end{cases}$$
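
The discontinuity can be traced to the part of the integral beyond a large bound B. For x > 0 the substitution xy = z gives $\int_B^{\infty} (\sin xy)/y\,dy = \pi/2 - \mathrm{Si}(xB)$, where Si denotes the sine integral. The Python sketch below, whose function names and power-series evaluation of Si are assumptions of this example, shows that for x = 1/B this remainder stays near 0.6247 however large B is chosen, while for fixed x it becomes small.

```python
import math

def Si(t, terms=80):
    """Sine integral int_0^t sin(z)/z dz from its power series (moderate |t| only)."""
    total = 0.0
    term = t                      # term holds (-1)^k t^(2k+1) / (2k+1)!
    for k in range(terms):
        total += term / (2 * k + 1)
        term *= -t * t / ((2 * k + 2) * (2 * k + 3))
    return total

def remainder(x, B):
    """int_B^inf sin(x*y)/y dy for x > 0, written as pi/2 - Si(x*B)."""
    return math.pi / 2.0 - Si(x * B)

if __name__ == "__main__":
    for B in (10.0, 100.0, 1000.0):
        # The remainder at x = 1/B does not shrink as B grows.
        print(B, remainder(1.0 / B, B), remainder(1.0, min(B, 20.0)))
```

No single B therefore makes the remainder small for all x near 0 at once.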


In itself, this fact is not at all surprising, for it is analogous to the 
situation of nonuniform convergence for infinite series (Volume I, 
p. 533), and we must remember that the process of integration is a 
generalized summation. We can be sure that an infinite series of 
continuous functions represents a continuous function only if the con- 
vergence is uniform. Here, in the case of improper integrals depending 
on a parameter, we must again introduce the concept of uniform 
convergence. 
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We say that the integral 
(52a) F(x) = | Kæ, y)dy 


converges uniformly (in x) in the interval a < x < b, provided that the 
“remainder” of the integral can be made arbitrarily small simultane- 
ously for all values of x in the interval under consideration, or, more 
precisely, provided that for a given positive number é, there is a positive 
number A = A(e) that does not depend on x and is such that whenever 
BEA 


(52b) |, Kæ »)dy| < e. 
As a useful test we mention that the integral 
in f(x, y)dy 


converges uniformly (and absolutely) if for sufficiently large y, say y > 
yo, the relation 


(52c) f| < 


holds, where M is a positive constant and a > 1. For, in this case, 


7 "dy y— t — t 
S flx, dy) <M f ya =Ma- 1)B" SMa- 1)A™! ’ 


the last bound can be made as small as we please by choosing A 
sufficiently large, and it is independent of x. This is a straightforward 
analogue of the test for the uniform convergence of series given in 
Volume I (p. 535). 
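
As a hedged illustration of test (52c), take the integrand cos(xy)/(1 + y²), which is our own example and not one from the text. Here |f(x, y)| < 1/y², so M = 1 and a = 2, and the remainder beyond A is at most 1/A for every x. The sketch below compares truncated remainders for several x with that bound; the names `simpson` and `remainder` and the cutoff `T` are assumptions made for the illustration.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def remainder(x, A, T=400.0, n=40000):
    """Integral of cos(x*y)/(1 + y^2) over A <= y <= T (a truncated remainder)."""
    return simpson(lambda y: math.cos(x * y) / (1.0 + y * y), A, T, n)

A = 20.0
bound = 1.0 / A  # M / ((a - 1) * A**(a - 1)) with M = 1, a = 2
worst = max(abs(remainder(x, A)) for x in (0.0, 0.5, 1.0, 5.0, 25.0))
print(f"largest remainder = {worst:.5f}, bound = {bound:.5f}")
```

The same bound 1/A covers every sampled x, which is what uniformity of the convergence means.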

We readily see that a uniformly convergent integral of a continuous
function is itself a continuous function, for if we choose A so that

$\left| \int_A^{\infty} f(x, y)\,dy \right| < \varepsilon$

for all values of x in the interval under consideration, then, from (52a),

$|F(x + h) - F(x)| \le \left| \int_0^{A} \{f(x + h, y) - f(x, y)\}\,dy \right| + 2\varepsilon.$

By virtue of the uniform continuity of the function f(x, y) in a bounded
set, we can choose h so small that the finite integral on the right is
less than $\varepsilon$, which proves the continuity of the integral.

A similar result holds when the region of integration is finite, but 
the integrand has a point of infinite discontinuity. Suppose, for 
example, that the function f(x, y) tends to infinity as $y \to a$. We then
say that the convergent integral

(53a)    $F(x) = \int_a^{b} f(x, y)\,dy$

converges uniformly in $\alpha \le x \le \beta$ if for every positive number $\varepsilon$ we
can find a number k independent of x such that

(53b)    $\left| \int_a^{a+h} f(x, y)\,dy \right| < \varepsilon,$

provided h < k.
The condition in the neighborhood of the point y = a

(53c)    $|f(x, y)| < \frac{M}{(y - a)^{\nu}} \qquad (\nu < 1)$


is sufficient for uniform convergence. As before, uniform convergence 
for a continuous integrand implies that the integral is a continuous 
function. 

If the convergence is uniform in an interval $\alpha \le x \le \beta$, the improper
integral F(x) is continuous. We can then integrate F(x) over
this finite interval and thus form the corresponding improper repeated
integral

$\int_{\alpha}^{\beta} dx \int_0^{\infty} f(x, y)\,dy$

for an infinite interval of integration in y, and

$\int_{\alpha}^{\beta} dx \int_a^{b} f(x, y)\,dy$

for an infinite discontinuity.

Instead of the finite interval $\alpha \le x \le \beta$, we can of course also
consider an infinite interval of integration for x. But then the repeated
integral need not converge. For example, the integral

$F(x) = \int_0^{\infty} \frac{dy}{x^2 + y^2} = \frac{\pi}{2x}$

converges uniformly for $x \ge 1$, but

$\int_1^{\infty} F(x)\,dx$

does not exist.


b. Integration and Differentiation of Improper Integrals with 
Respect to a Parameter 


It is not true in general that improper integrals may be differenti- 
ated or integrated under the sign of integration with respect to a 
parameter. In other words, limit operations with respect to a para- 
meter and integration cannot generally be executed in reverse order 
(cf. the example on p. 473). 

In order to determine whether the order of integration in improper 
repeated integrals is reversible, we can often use the following test 
(or else make a special investigation along the lines of its proof): 


If the improper integral

(54a)    $F(x) = \int_0^{\infty} f(x, y)\,dy$

converges uniformly in the interval $\alpha \le x \le \beta$, then

(54b)    $\int_{\alpha}^{\beta} dx \int_0^{\infty} f(x, y)\,dy = \int_0^{\infty} dy \int_{\alpha}^{\beta} f(x, y)\,dx.$

To prove this we put

$\int_0^{\infty} f(x, y)\,dy = \int_0^{A} f(x, y)\,dy + R_A(x).$

By hypothesis, $|R_A(x)| < \varepsilon(A)$, where $\varepsilon(A)$ depends only on A, not
on x, and tends to zero as $A \to \infty$. The theorem on p. 80 on interchanging
the order of integration yields

$\int_{\alpha}^{\beta} dx \int_0^{\infty} f(x, y)\,dy = \int_{\alpha}^{\beta} dx \int_0^{A} f(x, y)\,dy + \int_{\alpha}^{\beta} R_A(x)\,dx$
$\qquad = \int_0^{A} dy \int_{\alpha}^{\beta} f(x, y)\,dx + \int_{\alpha}^{\beta} R_A(x)\,dx,$

whence by the mean value theorem of the integral calculus

$\left| \int_{\alpha}^{\beta} dx \int_0^{\infty} f(x, y)\,dy - \int_0^{A} dy \int_{\alpha}^{\beta} f(x, y)\,dx \right| \le \varepsilon(A)\,|\beta - \alpha|.$

If we now let A tend to infinity, we obtain the formula (54b).

If the interval of integration with respect to a parameter is infinite 
also, the change of order is not always possible, even though the 
convergence may be uniform. It can, however, be performed if the cor- 
responding improper double integral exists (cf. Chapter 4, pp. 408 ff.). 
Thus,

(54c)    $\int_0^{\infty} dx \int_0^{\infty} f(x, y)\,dy = \int_0^{\infty} dy \int_0^{\infty} f(x, y)\,dx$

if the double integral $\iint |f(x, y)|\,dx\,dy$ over the whole first quadrant
exists.

Formula (54c) holds since the improper double integral is independ- 
ent of the mode of approximation to the region of integration. In the 
one case, we approximate the integral by means of infinite strips 
parallel to the x-axis, and in the other, by strips parallel to the y-axis. 

A similar result also holds if the interval of integration is finite, 
but the integrand is discontinuous along a finite number of straight 
lines y = constant or on a finite number of more general curves in the 
region of integration. The corresponding theorem is as follows: 


If the function f(x, y) is discontinuous only along a finite number of
straight lines $y = a_1, y = a_2, \ldots, y = a_r$ and if the integral

$\int_a^{b} f(x, y)\,dy$

converges uniformly in x in the interval $\alpha \le x \le \beta$, then in this interval
it represents a continuous function of x, and

(54d)    $\int_{\alpha}^{\beta} dx \int_a^{b} f(x, y)\,dy = \int_a^{b} dy \int_{\alpha}^{\beta} f(x, y)\,dx.$


That is, under these hypotheses the order of integration can be 
changed. The proof of the theorem is analogous to the one for formula 
(54b) given above. 

It is equally easy to extend the rules for differentiation with re- 
spect to a parameter. The following theorem holds: 


If the function f(x, y) has a sectionally continuous derivative with
respect to x in the interval $\alpha \le x \le \beta$ and the two integrals

(55a)    $F(x) = \int_0^{\infty} f(x, y)\,dy$  and  $\int_0^{\infty} f_x(x, y)\,dy$

converge uniformly, then

(55b)    $F'(x) = \int_0^{\infty} f_x(x, y)\,dy.$

That is, under these hypotheses, the order of the processes of
integration and of differentiation with respect to a parameter can be
reversed, for, if we put

$G(x) = \int_0^{\infty} f_x(x, y)\,dy,$

then (54b) yields

$\int_{\alpha}^{\xi} G(x)\,dx = \int_{\alpha}^{\xi} dx \int_0^{\infty} f_x(x, y)\,dy = \int_0^{\infty} dy \int_{\alpha}^{\xi} f_x(x, y)\,dx.$

The integrand on the right has the value

$\int_{\alpha}^{\xi} f_x(x, y)\,dx = f(\xi, y) - f(\alpha, y);$

therefore,

$\int_{\alpha}^{\xi} G(x)\,dx = F(\xi) - F(\alpha);$

hence, if we differentiate and then replace $\xi$ by x, we obtain

$\frac{dF(x)}{dx} = G(x) = \int_0^{\infty} f_x(x, y)\,dy,$

as was to be proved.
We can similarly extend the rule for differentiation when one of
the limits depends on the parameter x (see Chapter 1, p. 77), for we
can write

$\int_{\varphi(x)}^{\infty} f(x, y)\,dy = \int_{\varphi(x)}^{a} f(x, y)\,dy + \int_a^{\infty} f(x, y)\,dy,$

where a is any fixed value in the interval of integration. Then we can
apply rules previously proved to each of the two terms on the right.

As before, our rules of differentiation also hold for improper
integrals with finite intervals of integration.

c. Examples

1. We consider the integral

$\int_0^{\infty} e^{-xy}\,dy = \frac{1}{x} \qquad (x > 0).$

If $x \ge 1$, this integral converges uniformly, since for positive values
of A

$\int_A^{\infty} e^{-xy}\,dy \le \int_A^{\infty} e^{-y}\,dy = e^{-A},$

where the final bound no longer depends on x and can be made as
small as we please if we choose A sufficiently large. The same is true
of the integrals of the partial derivatives of the function with respect
to x. By repeated differentiation, we thus obtain

$\int_0^{\infty} y e^{-xy}\,dy = \frac{1}{x^2}, \quad \int_0^{\infty} y^2 e^{-xy}\,dy = \frac{2}{x^3}, \quad \ldots, \quad \int_0^{\infty} y^n e^{-xy}\,dy = \frac{n!}{x^{n+1}}.$

In particular, for x = 1, we have

$\Gamma(n + 1) = \int_0^{\infty} y^n e^{-y}\,dy = n!.$

This formula was established differently in Volume I (p. 308).
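
The formula obtained by repeated differentiation, namely that the integral of yⁿe^(-xy) over 0 ≤ y < ∞ equals n!/x^(n+1), can be spot-checked numerically. This is a sketch only; the function names and the cutoff 80/x, beyond which the integrand is negligible, are our own assumptions.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def moment(n, x, steps=20000):
    """Approximate the integral of y**n * exp(-x*y) over 0 <= y <= 80/x."""
    upper = 80.0 / x
    return simpson(lambda y: y ** n * math.exp(-x * y), 0.0, upper, steps)

for n, x in ((1, 1.0), (3, 2.0), (5, 1.0)):
    exact = math.factorial(n) / x ** (n + 1)
    print(f"n = {n}, x = {x}: numeric = {moment(n, x):.10f}, n!/x^(n+1) = {exact:.10f}")
```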
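2. Further, let us consider the integral

$\int_0^{\infty} \frac{dy}{x^2 + y^2} = \frac{\pi}{2}\,\frac{1}{x}.$

Again it is easy to convince ourselves that if $x \ge \alpha$, where $\alpha$ is any
positive number, all the assumptions required for differentiation
under the integral sign are satisfied. By repeated differentiation we
therefore obtain the sequence of formulas

$\int_0^{\infty} \frac{dy}{(x^2 + y^2)^2} = \frac{\pi}{2}\,\frac{1}{2}\,\frac{1}{x^3}, \quad \int_0^{\infty} \frac{dy}{(x^2 + y^2)^3} = \frac{\pi}{2}\,\frac{1 \cdot 3}{2 \cdot 4}\,\frac{1}{x^5}, \quad \ldots,$

$\int_0^{\infty} \frac{dy}{(x^2 + y^2)^n} = \frac{\pi}{2}\,\frac{1 \cdot 3 \cdots (2n - 3)}{2 \cdot 4 \cdots (2n - 2)}\,\frac{1}{x^{2n-1}}.$

From these formulas we can get another derivation of Wallis’s
product for $\pi$ (cf. Volume I, p. 281). For this we put $x = \sqrt{n}$ to obtain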
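
$\int_0^{\infty} \frac{dy}{(1 + y^2/n)^n} = \frac{\pi}{2}\,\frac{1 \cdot 3 \cdots (2n - 3)}{2 \cdot 4 \cdots (2n - 2)}\,\sqrt{n}.$

As n increases, the left side converges to the integral

$\int_0^{\infty} e^{-y^2}\,dy = \tfrac{1}{2}\sqrt{\pi}.$

To prove this, we estimate the difference

$\int_0^{\infty} e^{-y^2}\,dy - \int_0^{\infty} \frac{dy}{(1 + y^2/n)^n}.$

This difference satisfies the inequality

$\left| \int_0^{\infty} e^{-y^2}\,dy - \int_0^{\infty} \frac{dy}{(1 + y^2/n)^n} \right|$
$\qquad \le \int_0^{T} \left| e^{-y^2} - \frac{1}{(1 + y^2/n)^n} \right| dy + \int_T^{\infty} e^{-y^2}\,dy + \int_T^{\infty} \frac{dy}{(1 + y^2/n)^n}$
$\qquad \le \int_0^{T} \left| e^{-y^2} - \frac{1}{(1 + y^2/n)^n} \right| dy + \int_T^{\infty} e^{-y^2}\,dy + \frac{1}{T},$

since $(1 + y^2/n)^n > y^2$. But if we choose T so large that

$\int_T^{\infty} e^{-y^2}\,dy + \frac{1}{T} < \frac{\varepsilon}{2}$

and then choose n so large that

$\int_0^{T} \left| e^{-y^2} - \frac{1}{(1 + y^2/n)^n} \right| dy < \frac{\varepsilon}{2},$

as is possible in virtue of the uniform convergence of the limit

$\lim_{n \to \infty} (1 + y^2/n)^{-n} = e^{-y^2}$

(Volume I, p. 152), it follows at once that

$\left| \int_0^{\infty} e^{-y^2}\,dy - \int_0^{\infty} \frac{dy}{(1 + y^2/n)^n} \right| < \varepsilon.$

With the value of the integral of $e^{-y^2}$ from (25a), p. 415, this establishes
the relation

(56)    $\lim_{n \to \infty} \frac{1 \cdot 3 \cdot 5 \cdots (2n - 3)}{2 \cdot 4 \cdot 6 \cdots (2n - 2)}\,\sqrt{n} = \frac{1}{\sqrt{\pi}},$

which is equivalent to formula (80) in Volume I (p. 282).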
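
The limit relation for the ratio of odd to even products, multiplied by √n, converges rather slowly, which a short computation makes visible. This is a minimal sketch; the function name `wallis_ratio` is our own.

```python
import math

def wallis_ratio(n):
    """Return (1*3*...*(2n-3)) / (2*4*...*(2n-2)) * sqrt(n) for n >= 2."""
    ratio = 1.0
    for k in range(1, n):
        ratio *= (2 * k - 1) / (2 * k)
    return ratio * math.sqrt(n)

target = 1.0 / math.sqrt(math.pi)
for n in (10, 100, 1000, 100000):
    print(f"n = {n:6d}: {wallis_ratio(n):.8f}   (1/sqrt(pi) = {target:.8f})")
```

The error decreases roughly like 1/n, in keeping with the slow convergence of Wallis's product itself.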
3. With a view to calculating the integral

$\int_0^{\infty} \frac{\sin y}{y}\,dy,$

we shall discuss the function

$F(x) = \int_0^{\infty} e^{-xy}\,\frac{\sin y}{y}\,dy.$

This integral converges uniformly if $x \ge 0$, while the integral

$\int_0^{\infty} e^{-xy} \sin y\,dy$

converges uniformly if $x \ge \delta > 0$, where $\delta$ is an arbitrarily small
positive number. Both these statements will be proved below. Therefore,
F(x) is continuous if $x \ge 0$; and if $x \ge \delta$, we have

$F'(x) = -\int_0^{\infty} e^{-xy} \sin y\,dy.$

Integrating by parts twice, we easily evaluate this last integral (see
Volume I, p. 277):

$F'(x) = -\frac{1}{1 + x^2}.$

We integrate this to obtain

$F(x) = -\arctan x + C,$

where C is a constant.¹ By virtue of the relation

$\left| \int_0^{\infty} e^{-xy}\,\frac{\sin y}{y}\,dy \right| \le \int_0^{\infty} e^{-xy}\,dy = \frac{1}{x},$

which holds if $x \ge \delta$, we see that $\lim_{x \to \infty} F(x) = 0$. Since
$\lim_{x \to \infty} \arctan x = \pi/2$, C must be $\pi/2$, and we obtain

$F(x) = \frac{\pi}{2} - \arctan x.$

Since F(x) is continuous for $x \ge 0$,

$\lim_{x \to 0} F(x) = F(0) = \int_0^{\infty} \frac{\sin y}{y}\,dy,$

which gives the required formula

(57)    $\int_0^{\infty} \frac{\sin y}{y}\,dy = \frac{\pi}{2}$

(cf. Volume I, p. 589).

¹Here arc tan x denotes the principal branch of that function, as defined in Volume I
(p. 214).
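
The closed form F(x) = π/2 - arctan x for the integral of e^(-xy) (sin y)/y over 0 ≤ y < ∞ lends itself to a numerical spot check for x > 0. The sketch below is an illustration under our own assumptions: the integral is cut off at y = 60/x, where the factor e^(-xy) is negligible, and the helper names are hypothetical.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def damped_sine_integral(x, n=40000):
    """Approximate the integral of exp(-x*y) * sin(y) / y over 0 <= y <= 60/x, x > 0."""
    def integrand(y):
        # sin(y)/y tends to 1 as y tends to 0.
        return 1.0 if y == 0.0 else math.exp(-x * y) * math.sin(y) / y
    return simpson(integrand, 0.0, 60.0 / x, n)

for x in (0.5, 1.0, 2.0):
    closed_form = math.pi / 2 - math.atan(x)
    print(f"x = {x}: numeric = {damped_sine_integral(x):.9f}, closed form = {closed_form:.9f}")
```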

We prove that

$\int_0^{\infty} e^{-xy}\,\frac{\sin y}{y}\,dy$

converges uniformly if $x \ge 0$. If A is an arbitrary number and $k\pi$ is
the least multiple of $\pi$ that exceeds A, we can write the “remainder”
of the integral in the form

$\int_A^{\infty} e^{-xy}\,\frac{\sin y}{y}\,dy = \int_A^{k\pi} e^{-xy}\,\frac{\sin y}{y}\,dy + \sum_{\nu = k}^{\infty} \int_{\nu\pi}^{(\nu+1)\pi} e^{-xy}\,\frac{\sin y}{y}\,dy.$

The terms of the series on the right have alternating signs and their
absolute values tend monotonically to 0. By Leibnitz’s test (Volume I,
p. 514), therefore, the series converges and the absolute value of its
sum is less than that of its first term. Hence, we have the inequality

$\left| \int_A^{\infty} e^{-xy}\,\frac{\sin y}{y}\,dy \right| < \int_A^{(k+1)\pi} e^{-xy}\,\frac{|\sin y|}{y}\,dy < \int_A^{(k+1)\pi} \frac{1}{A}\,dy \le \frac{2\pi}{A},$

in which the right side is independent of x and can be made as small
as we please. This establishes the uniformity of convergence.
The uniform convergence of

$\int_0^{\infty} e^{-xy} \sin y\,dy$

for $x \ge \delta > 0$ follows at once from the relation

$\left| \int_A^{\infty} e^{-xy} \sin y\,dy \right| \le \int_A^{\infty} e^{-xy}\,dy = \frac{e^{-Ax}}{x} \le \frac{e^{-A\delta}}{\delta}.$

4. On p. 466 we learned that uniform convergence of the integrals
is a sufficient condition for reversibility of the order of integration.
Mere convergence is not sufficient, as the following example shows:
If we put $f(x, y) = (2 - xy)\,xy\,e^{-xy}$, then, since

$f(x, y) = \frac{\partial}{\partial y}\,(x y^2 e^{-xy}),$

the integral

$\int_0^{\infty} f(x, y)\,dy$

exists for every x in the interval $0 \le x \le 1$; in fact, for every such
value of x, it has the value 0. Therefore,

$\int_0^{1} dx \int_0^{\infty} f(x, y)\,dy = 0.$

On the other hand, since

$f(x, y) = \frac{\partial}{\partial x}\,(x^2 y e^{-xy})$

for every $y \ge 0$, we have

$\int_0^{1} f(x, y)\,dx = y e^{-y},$

and, therefore,

$\int_0^{\infty} dy \int_0^{1} f(x, y)\,dx = \int_0^{\infty} y e^{-y}\,dy = \int_0^{\infty} e^{-y}\,dy = 1.$

Hence,

$\int_0^{1} dx \int_0^{\infty} f(x, y)\,dy \ne \int_0^{\infty} dy \int_0^{1} f(x, y)\,dx.$
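
The failure in this example can be traced to nonuniform convergence of the inner integral. The sketch below is an illustration of our own and uses only the two antiderivatives quoted in the example. The truncated inner integral, taken over 0 ≤ y ≤ Y, equals xY²e^(-xY); it tends to 0 for each fixed x but has a peak of height Y/e at x = 1/Y. Integrating the truncated inner integral over 0 ≤ x ≤ 1 gives 1 - (1 + Y)e^(-Y), which tends to 1 and not to 0. The function names are hypothetical.

```python
import math

def inner_dy(x, Y):
    """Integral of f(x, y) over 0 <= y <= Y, from f = d/dy (x * y^2 * exp(-x*y))."""
    return x * Y * Y * math.exp(-x * Y)

def outer_after_truncation(Y):
    """Integral of inner_dy(x, Y) over 0 <= x <= 1, in closed form."""
    return 1.0 - (1.0 + Y) * math.exp(-Y)

for Y in (10.0, 100.0, 1000.0):
    peak = inner_dy(1.0 / Y, Y)  # maximum over x, attained at x = 1/Y
    print(f"Y = {Y:6.0f}: inner at x = 0.5 -> {inner_dy(0.5, Y):.3e}, "
          f"peak -> {peak:.3f}, iterated integral -> {outer_after_truncation(Y):.6f}")
```

The remainder therefore cannot be made small simultaneously for all x in the interval, so the hypothesis of the theorem on p. 466 is violated.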


d. Evaluation of Fresnel’s Integrals 


Fresnel’s integrals 

(58a)    $F_1 = \int_{-\infty}^{+\infty} \sin(\tau^2)\,d\tau, \qquad F_2 = \int_{-\infty}^{+\infty} \cos(\tau^2)\,d\tau,$

are important in optics. In order to evaluate them, we apply the
substitution $\tau^2 = t$, obtaining

$F_1 = \int_0^{\infty} \frac{\sin t}{\sqrt{t}}\,dt, \qquad F_2 = \int_0^{\infty} \frac{\cos t}{\sqrt{t}}\,dt.$

Here, we put

$\frac{1}{\sqrt{t}} = \frac{2}{\sqrt{\pi}} \int_0^{\infty} e^{-x^2 t}\,dx$

(this follows from the substitution $x = \tau/\sqrt{t}$) and reverse the order of
integration, as is permissible by our rules (we first restrict the integration
with respect to t to a finite interval $0 < a \le t \le b$, and then let
$a \to 0$, $b \to \infty$):

$F_1 = \frac{2}{\sqrt{\pi}} \int_0^{\infty} dx \int_0^{\infty} e^{-x^2 t} \sin t\,dt, \qquad F_2 = \frac{2}{\sqrt{\pi}} \int_0^{\infty} dx \int_0^{\infty} e^{-x^2 t} \cos t\,dt.$

Using integration by parts to evaluate the inner integrals, we reduce
$F_1$ and $F_2$ to the elementary rational integrals

$F_1 = \frac{2}{\sqrt{\pi}} \int_0^{\infty} \frac{dx}{1 + x^4}, \qquad F_2 = \frac{2}{\sqrt{\pi}} \int_0^{\infty} \frac{x^2}{1 + x^4}\,dx.$

The integrals may be evaluated from the formulae given in Volume I
(cf. Volume I, p. 290); the second integral can be reduced to the first
by means of the substitution $x' = 1/x$; both have the value $\pi/(2\sqrt{2})$.
Consequently,

(58b)    $F_1 = F_2 = \sqrt{\frac{\pi}{2}}.$
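
The last step can be checked numerically. In the sketch below, which is an illustration with names of our own choosing, the integral of 1/(1 + x⁴) over 0 ≤ x < ∞ is folded onto the unit interval by the same substitution x' = 1/x, which maps the part over x ≥ 1 into the integral of x²/(1 + x⁴) over 0 ≤ x ≤ 1.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def rational_integral(n=2000):
    """Integral of 1/(1 + x^4) over [0, inf), written as an integral over [0, 1]."""
    return simpson(lambda x: (1.0 + x * x) / (1.0 + x ** 4), 0.0, 1.0, n)

value = rational_integral()
fresnel = 2.0 / math.sqrt(math.pi) * value
print(f"rational integral = {value:.12f}, pi/(2*sqrt(2)) = {math.pi / (2 * math.sqrt(2)):.12f}")
print(f"F1 = F2 = {fresnel:.12f}, sqrt(pi/2) = {math.sqrt(math.pi / 2):.12f}")
```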


Exercises 4.12 


1. Evaluate $\int_0^{\infty} x^n e^{-x^2}\,dx.$
2. Evaluate

$F(y) = \int_0^{1} x^{y-1}(y \log x + 1)\,dx.$
3. Let f(x, y) be twice continuously differentiable and let u(x, y, z) be
defined as follows:

$u(x, y, z) = \int_0^{2\pi} f(x + z\cos\varphi,\; y + z\sin\varphi)\,d\varphi.$

Prove that

$z(u_{xx} + u_{yy} - u_{zz}) - u_z = 0.$


4. If f(x) is twice continuously differentiable and

$u(x, t) = \frac{1}{t^{p-2}} \int_{-t}^{+t} f(x + y)\,(t^2 - y^2)^{(p-3)/2}\,dy \qquad (p > 1),$

prove that

$u_{xx} = \frac{p - 1}{t}\,u_t + u_{tt}.$


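5. How must a, b, c be chosen in order that

$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp[-(ax^2 + 2bxy + cy^2)]\,dx\,dy = 1?$

6. Evaluate

(a) $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp[-(ax^2 + 2bxy + cy^2)]\,(Ax^2 + 2Bxy + Cy^2)\,dx\,dy,$

(b) $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp[-(ax^2 + 2bxy + cy^2)]\,(ax^2 + 2bxy + cy^2)\,dx\,dy,$

where a > 0, $ac - b^2 > 0$.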
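7. The Bessel function $J_0(x)$ may be defined by

$J_0(x) = \frac{1}{\pi} \int_{-1}^{+1} \frac{\cos xt}{\sqrt{1 - t^2}}\,dt.$

Prove that

$J_0'' + \frac{1}{x}\,J_0' + J_0 = 0.$

8. For any nonnegative integral index n the Bessel function $J_n(x)$ may be
defined by

$J_n(x) = \frac{x^n}{1 \cdot 3 \cdot 5 \cdots (2n - 1)\,\pi} \int_{-1}^{+1} (\cos xt)\,(1 - t^2)^{n - 1/2}\,dt.$

Prove that

(a) $J_n'' + \frac{1}{x}\,J_n' + \left(1 - \frac{n^2}{x^2}\right) J_n = 0 \quad (n \ge 0),$
(b) $J_{n+1} = J_{n-1} - 2J_n' \quad (n \ge 1)$

and

$J_1 = -J_0'.$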
9. Evaluate the following integrals:

(a) $K(a) = \int_0^{\infty} e^{-ax^2} \cos x\,dx$

(b) $\int_0^{\infty} \frac{e^{-bx} - e^{-ax}}{x}\,\cos x\,dx$

(c) $I(a) = \int_0^{\infty} \exp(-x^2 - a^2/x^2)\,dx$


(a) [a 


where Jo denotes the Bessel function defined in Exercise 7. 
10. Prove that

$\int_0^{n} \frac{\sin^2 ax}{x}\,dx$

is of the order of log n when n is large and that

$\int_0^{\infty} \frac{\sin^2 ax - \sin^2 bx}{x}\,dx = \frac{1}{2} \log \frac{a}{b}.$

11. Replace the statement “The integral $\int_0^{\infty} f(x, y)\,dy$ is not uniformly
convergent” by an equivalent statement not involving any form of the
words “uniformly convergent”.


4.13 The Fourier Integral 


a. Introduction 


The theory given in Section 4.12 is illustrated by Fourier’s integral
theorem (see Volume I, p. 615), which is fundamental in analysis and
mathematical physics. We recall that Fourier series represent a
sectionally smooth, but otherwise arbitrary, periodic function in
terms of trigonometric functions. Fourier’s integral gives a corresponding
trigonometrical representation of a nonperiodic function
f(x) that is defined in the infinite interval $-\infty < x < +\infty$ and has
its behavior at infinity restricted in a suitable way to ensure convergence.

We make the following assumptions about the function f(x): 


1. In any finite interval f(x) is defined, continuous, and has a 
continuous first derivative f'(x), except possibly for a finite number of 
points. 

2. Near each exceptional point f'(x) is bounded. At an exceptional
point, f(x) takes as its value the arithmetic mean of the limits on the
right and left:

(59a)    $f(x) = \tfrac{1}{2}\,[f(x + 0) + f(x - 0)].$

3. The integral

(59b)    $\int_{-\infty}^{\infty} |f(x)|\,dx = C$

is convergent.


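Then Fourier’s integral theorem states:

(60)    $f(x) = \frac{1}{\pi} \int_0^{\infty} d\tau \int_{-\infty}^{\infty} f(t) \cos \tau(t - x)\,dt.$

Using the identity

$\cos \tau(t - x) = \tfrac{1}{2}\,\left(e^{-i\tau t + i\tau x} + e^{i\tau t - i\tau x}\right)$

and putting

(61a)    $g(\tau) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\,e^{-i\tau t}\,dt,$

we can write formula (60) in the form

$f(x) = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} [e^{i\tau x} g(\tau) + e^{-i\tau x} g(-\tau)]\,d\tau$
$\qquad = \lim_{A \to \infty} \frac{1}{\sqrt{2\pi}} \int_0^{A} [e^{i\tau x} g(\tau) + e^{-i\tau x} g(-\tau)]\,d\tau$
$\qquad = \lim_{A \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} g(\tau)\,e^{ix\tau}\,d\tau.$

Hence, Fourier’s theorem becomes

(61b)    $f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(\tau)\,e^{ix\tau}\,d\tau.$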


1For an exceptional x we do not require that f'(x) be defined. However, the bounded- 
ness of f’ near an exceptional x implies that the limits f(x — 0) and f(x + 0), from 
the left and right, exist. 

In the complex form, (61a) associates with a function f(x) another
function $g(\tau)$, the Fourier transform of f. Fourier’s theorem, as given
by formula (61b), expresses f in terms of g in a quite symmetric fashion;
as a matter of fact, it just states that f(-x) is the Fourier transform of
$g(\tau)$. The relation between f and g is reciprocal except for the sign of
the exponent and the fact that according to our derivation from (60)
the improper integral in (61b) is to be taken in the restricted sense

$\int_{-\infty}^{\infty} = \lim_{A \to \infty} \int_{-A}^{A}.$

In formula (61a) for g, however, the integral is absolutely convergent
by assumption (59b), and the upper and lower limits can tend independently
to $+\infty$ and $-\infty$, respectively. The two formulas (61a, b)
are reciprocal equations, each yielding the one function in terms of
the other.

The Fourier transform $g(\tau)$ of a real-valued function f(x) generally
takes complex values. From (61a) we obtain the complex conjugate
equation for a real f,

(62)    $\overline{g(\tau)} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\,e^{i\tau t}\,dt = g(-\tau).$

When f(x) is an even function of x, however, the Fourier transform g
is even, too, and is real for real f. Indeed, combining the contributions
of t and -t in the integral (61a), we obtain

(63a)    $g(\tau) = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} f(t) \cos(\tau t)\,dt,$

which implies that $g(\tau) = g(-\tau)$. Formula (61b) can then be written in
the form

(63b)    $f(x) = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} g(\tau) \cos(\tau x)\,d\tau = \frac{2}{\pi} \int_0^{\infty} \cos(\tau x)\,d\tau \int_0^{\infty} f(t) \cos(\tau t)\,dt.$

Similarly, for an odd function f(x),

(64a)    $g(\tau) = -\frac{2i}{\sqrt{2\pi}} \int_0^{\infty} f(t) \sin(\tau t)\,dt.$

In (64a), g is an odd function with values that are pure imaginary for
real f. The reciprocal formula becomes

(64b)    $f(x) = \frac{2i}{\sqrt{2\pi}} \int_0^{\infty} g(\tau) \sin(\tau x)\,d\tau = \frac{2}{\pi} \int_0^{\infty} \sin(\tau x)\,d\tau \int_0^{\infty} f(t) \sin(\tau t)\,dt.$

We illustrate Fourier’s integral theorem by examples and then 
proceed to its proof. 


b. Examples 


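1. Let f(x) be the step function defined by f(x) = 1 when $x^2 < 1$,
f(x) = 0 when $x^2 > 1$. By formula (63a) the Fourier transform of f is
the function

$g(\tau) = \frac{2}{\sqrt{2\pi}} \int_0^{1} \cos(\tau t)\,dt = \frac{2}{\sqrt{2\pi}}\,\frac{\sin \tau}{\tau}.$

Hence, by (63b),

(65a)    $f(x) = \frac{2}{\pi} \int_0^{\infty} \cos(\tau x)\,\frac{\sin \tau}{\tau}\,d\tau = \begin{cases} 1 & \text{for } |x| < 1 \\ \tfrac{1}{2} & \text{for } x = \pm 1 \\ 0 & \text{for } |x| > 1. \end{cases}$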


This integral appears in mathematical literature under the name of 
Dirichlet’s discontinuous factor. It shows that an integral can be a 
discontinuous function of a parameter x although the integrand is 
continuous in x. Of course, this phenomenon can occur only because 
the integral is improper. 

2. Let $f(x) = e^{-kx}$ for x > 0, where k is a positive real number.
Defining f as an even function for all x, we find its Fourier transform:

$g(\tau) = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} \cos(\tau t)\,e^{-kt}\,dt = \frac{2}{\sqrt{2\pi}}\,\frac{k}{k^2 + \tau^2}$

[see formula (64), p. 277, of Volume I for the evaluation of the integral].
By (63b) this leads to the equation

(65b)    $f(x) = \frac{2}{\pi} \int_0^{\infty} \frac{k \cos(\tau x)}{k^2 + \tau^2}\,d\tau = e^{-k|x|}.$

On the other hand, continuing $e^{-kx}$ as an odd function of x for negative
x, we obtain the Fourier transform

$g(\tau) = -\frac{2i}{\sqrt{2\pi}} \int_0^{\infty} \sin(\tau t)\,e^{-kt}\,dt = -\frac{2i}{\sqrt{2\pi}}\,\frac{\tau}{k^2 + \tau^2}$

and the formula

(65c)    $f(x) = \frac{2}{\pi} \int_0^{\infty} \frac{\tau \sin(\tau x)}{k^2 + \tau^2}\,d\tau = \begin{cases} e^{-kx} & \text{for } x > 0 \\ 0 & \text{for } x = 0 \\ -e^{kx} & \text{for } x < 0. \end{cases}$


3. The function $f(x) = e^{-x^2/2}$ gives an interesting illustration of our
reciprocal formulas. The Fourier transform is

$g(\tau) = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} e^{-x^2/2} \cos(x\tau)\,dx.$

We are handicapped in evaluating g by the fact that no explicit
expression for the indefinite integral is available. Curiously enough,
g can be found by solving a differential equation. On differentiating
the expression for g and integrating by parts, we obtain

$g'(\tau) = -\frac{2}{\sqrt{2\pi}} \int_0^{\infty} (x e^{-x^2/2}) \sin(x\tau)\,dx$
$\qquad = \frac{2}{\sqrt{2\pi}} \left[ e^{-x^2/2} \sin(x\tau) \Big|_0^{\infty} - \tau \int_0^{\infty} e^{-x^2/2} \cos(x\tau)\,dx \right]$
$\qquad = -\tau g(\tau).$

It follows that

$\frac{d}{d\tau}\,[g(\tau)\,e^{\tau^2/2}] = (g\tau + g')\,e^{\tau^2/2} = 0$

or that

$g(\tau)\,e^{\tau^2/2} = \text{constant} = c.$

Thus, the Fourier transform g of the function $f = e^{-x^2/2}$ has the form

$g(\tau) = c\,e^{-\tau^2/2}$

with a certain constant c. Since [see (25a) p. 415]

$c = g(0) = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} e^{-x^2/2}\,dx = \frac{2}{\sqrt{\pi}} \int_0^{\infty} e^{-y^2}\,dy = 1,$

we find that the Fourier transform of $f = e^{-x^2/2}$ is the same function:

(66a)    $g(\tau) = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} e^{-x^2/2} \cos(x\tau)\,dx = e^{-\tau^2/2}.$
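
That e^(-x²/2) reproduces itself under the transform (63a) can be confirmed numerically. The following sketch is an illustration under our own assumptions: the integral over 0 ≤ x < ∞ is cut off at x = 12, where the Gaussian is far below rounding error, and the helper names are hypothetical.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def gaussian_transform(tau, upper=12.0, n=6000):
    """Cosine transform (63a) of exp(-x^2/2), with the integral truncated at `upper`."""
    integrand = lambda x: math.exp(-0.5 * x * x) * math.cos(x * tau)
    return 2.0 / math.sqrt(2.0 * math.pi) * simpson(integrand, 0.0, upper, n)

for tau in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"tau = {tau}: g(tau) = {gaussian_transform(tau):.10f}, "
          f"exp(-tau^2/2) = {math.exp(-0.5 * tau * tau):.10f}")
```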


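c. Proof of Fourier’s Integral Theorem

The proof (like the corresponding one for Fourier series in Volume
I) is based on a simple lemma (“Riemann-Lebesgue lemma”):

If $\varphi(t)$ is bounded and continuous in the open interval a < t < b, we
have

(67)    $\lim_{A \to \infty} \int_a^{b} \varphi(t) \sin At\,dt = 0.$

For the proof of the lemma, we assume that $|\varphi(t)| \le M$ for a < t < b.
Let $\varepsilon$ be a prescribed positive number. Let $\alpha$ and $\beta$ be chosen so
that

$a < \alpha < a + \frac{\varepsilon}{M}, \qquad b - \frac{\varepsilon}{M} < \beta < b, \qquad \alpha < \beta.$

Then,

$\left| \int_a^{b} \varphi(t) \sin At\,dt \right| \le \left| \int_{\alpha}^{\beta} \varphi(t) \sin At\,dt \right| + 2\varepsilon.$

In the closed interval $\alpha \le t \le \beta$, the function $\varphi(t)$ is uniformly
continuous and we can find a $\delta$ such that

$|\varphi(s) - \varphi(t)| < \frac{\varepsilon}{\beta - \alpha} \qquad \text{for } |s - t| < \delta.$

Now, replacing t by $t + \pi/A$ in the integral we have

$\int_{\alpha}^{\beta} \varphi(t) \sin At\,dt = -\int_{\alpha - \pi/A}^{\beta - \pi/A} \varphi\!\left(t + \frac{\pi}{A}\right) \sin At\,dt$
$\qquad = -\int_{\alpha}^{\beta} \varphi(t) \sin At\,dt - \int_{\alpha}^{\beta - \pi/A} \left[\varphi\!\left(t + \frac{\pi}{A}\right) - \varphi(t)\right] \sin At\,dt$
$\qquad\quad + \int_{\beta - \pi/A}^{\beta} \varphi(t) \sin At\,dt - \int_{\alpha - \pi/A}^{\alpha} \varphi\!\left(t + \frac{\pi}{A}\right) \sin At\,dt.$

Hence, if A is so large that $\pi/A < \delta$ and $2M\pi/A < \varepsilon$, we find that

$\left| 2 \int_{\alpha}^{\beta} \varphi(t) \sin At\,dt \right| < \frac{\beta - \alpha - \pi/A}{\beta - \alpha}\,\varepsilon + \frac{2M\pi}{A} < 2\varepsilon,$

and, thus, also

$\left| \int_a^{b} \varphi(t) \sin At\,dt \right| < 3\varepsilon.$

Since $\varepsilon$ is arbitrary, the relation (67) follows.

It is clear that formula (67) holds more generally, namely when,
by removing a finite number of exceptional points, the interval a < t
< b can be broken up into open intervals in each of which $\varphi(t)$ is
continuous and bounded.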
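
The decay asserted by the lemma can be watched on a concrete function. In the sketch below we take the bounded continuous function eᵗ on 0 < t < 1, a choice of our own, and compare a numerical value of the integral of eᵗ sin At with its elementary closed form (e(sin A - A cos A) + A)/(1 + A²). The helper names are hypothetical.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def oscillatory_integral(A, n=20000):
    """Integral of exp(t) * sin(A*t) over 0 <= t <= 1, computed numerically."""
    return simpson(lambda t: math.exp(t) * math.sin(A * t), 0.0, 1.0, n)

def closed_form(A):
    """The same integral from the antiderivative exp(t)(sin At - A cos At)/(1 + A^2)."""
    return (math.e * (math.sin(A) - A * math.cos(A)) + A) / (1.0 + A * A)

for A in (10.0, 100.0, 1000.0):
    print(f"A = {A:6.0f}: numeric = {oscillatory_integral(A):+.6e}, "
          f"closed form = {closed_form(A):+.6e}")
```

For this smooth function the integral decays like 1/A; the lemma itself guarantees only convergence to 0, without a rate.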

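Now let f(t) be a function defined for all t that satisfies the
assumptions 1-3 stated on p. 476-7. In order to prove our main theorem
in the form (60), we first replace the infinite intervals of integration by
finite ones so that we may reverse the order of integration. For
positive A, B (and a fixed x), we introduce the expression

(68a)    $I_A = \frac{1}{\pi} \int_0^{A} d\tau \int_{-\infty}^{\infty} f(t) \cos \tau(t - x)\,dt.$

By assumption 3,

$\int_{-\infty}^{\infty} |f(t)|\,dt$

converges. Consequently, given $\varepsilon > 0$, we have

$\left| \int_{|t| > B} f(t) \cos \tau(t - x)\,dt \right| \le \int_{|t| > B} |f(t)|\,dt < \varepsilon$

for all sufficiently large B. It follows that

(68b)    $\lim_{B \to \infty} \int_{-B}^{B} f(t) \cos \tau(t - x)\,dt = \int_{-\infty}^{\infty} f(t) \cos \tau(t - x)\,dt$

converges uniformly in $\tau$.
Formula (60), which we want to prove, states that

(69)    $f(x) = \lim_{A \to \infty} I_A.$

In the integral (68a) defining $I_A$, we can interchange the integrations
[see (54b), p. 466] since the integral (68b) converges uniformly.¹ Thus,

$I_A = \frac{1}{\pi} \int_{-\infty}^{\infty} dt \int_0^{A} f(t) \cos \tau(t - x)\,d\tau$
$\qquad = \frac{1}{\pi} \int_{-\infty}^{\infty} f(t)\,\frac{\sin A(t - x)}{t - x}\,dt = \frac{1}{\pi} \int_{-\infty}^{\infty} f(t + x)\,\frac{\sin At}{t}\,dt.$

Using the identity

$\int_0^{\infty} \frac{\sin At}{t}\,dt = \frac{\pi}{2} \qquad \text{for } A > 0$

[see (57), p. 472], we can write this result in the form

$I_A = \frac{1}{\pi} \int_0^{\infty} [f(x + t) + f(x - t)]\,\frac{\sin At}{t}\,dt$
$\qquad = \frac{f(x + 0) + f(x - 0)}{2} + \frac{1}{\pi} \int_0^{\infty} \varphi(t) \sin At\,dt$
$\qquad = \frac{f(x + 0) + f(x - 0)}{2} + \frac{1}{\pi} \int_0^{C} \varphi(t) \sin At\,dt + \frac{1}{\pi} \int_C^{\infty} \varphi(t) \sin At\,dt,$

¹We apply the theorem on p. 466 separately to
$\int_0^{\infty} f(t) \cos \tau(t - x)\,dt$ and $\int_{-\infty}^{0} f(t) \cos \tau(t - x)\,dt.$
The function f may have a finite number of jump-discontinuities in any finite interval
without changing the proof of (54b).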
where C is any positive constant and

$\varphi(t) = \frac{f(x + t) - f(x + 0)}{t} + \frac{f(x - t) - f(x - 0)}{t}.$

The function $\varphi(t)$ satisfies all the assumptions of the Riemann-Lebesgue
lemma (67): It obviously is continuous except possibly at a finite
number of points, since this is true for f. At a point of discontinuity
$t \ne 0$ the function $\varphi(t)$ stays bounded, since f has jump-discontinuities
only. The boundedness of $\varphi(t)$ near t = 0 follows from the
differentiability of f and the boundedness of f', since by the mean value theorem
of differential calculus,

$\varphi(t) = f'(x + \theta t) - f'(x - \eta t),$

where $\theta$ and $\eta$ are certain values intermediate between 0 and 1.¹
Applying (67), we conclude that for any C > 0

$\lim_{A \to \infty} \frac{1}{\pi} \int_0^{C} \varphi(t) \sin At\,dt = 0.$

Moreover,

$\frac{1}{\pi} \int_C^{\infty} \varphi(t) \sin At\,dt = \frac{1}{\pi} \int_C^{\infty} \frac{f(x + t) + f(x - t)}{t}\,\sin At\,dt - \frac{f(x + 0) + f(x - 0)}{\pi} \int_{AC}^{\infty} \frac{\sin t}{t}\,dt.$

Here the second integral tends to 0 for $A \to \infty$ and any C, whereas by
choosing C sufficiently large, the first one can be made arbitrarily
small uniformly for all A > 0. It follows that

$\lim_{A \to \infty} I_A = \frac{f(x + 0) + f(x - 0)}{2}.$

This is equivalent to (69), since we assumed that

$f(x) = \tfrac{1}{2}\,[f(x + 0) + f(x - 0)]$

[see (59a)].

¹Notice that to apply the mean value theorem we only require existence of the
derivative in the interior of the interval and continuity in the closed interval (see
Volume I, p. 174). These assumptions are satisfied by the function defined by f(x + t)
for small positive t and by f(x + 0) for t = 0, as well as for the function defined by
f(x - t) for small positive t and by f(x - 0) for t = 0.




d. Rate of Convergence in Fourier’s Integral Theorem 


The reciprocal formulas (61a, b) have been established under the
assumptions 1-3 on the function f(x) stated on p. 476-7. A consequence
of the requirement

$\int_{-\infty}^{\infty} |f(x)|\,dx = C < \infty$

is that the integral (61a) defining the Fourier transform $g(\tau)$ converges
absolutely and uniformly. Indeed, if we put

(70a)    $g_B(\tau) = \frac{1}{\sqrt{2\pi}} \int_{-B}^{B} f(t)\,e^{-i\tau t}\,dt,$

then

$|g(\tau) - g_B(\tau)| = \left| \frac{1}{\sqrt{2\pi}} \int_{|t| > B} f(t)\,e^{-i\tau t}\,dt \right| \le \frac{1}{\sqrt{2\pi}} \int_{|t| > B} |f(t)|\,dt.$

Hence, given $\varepsilon > 0$, it is possible to find a B so large that

$|g(\tau) - g_B(\tau)| < \varepsilon \qquad \text{for all } \tau.$

It follows that g, as uniform limit of continuous functions $g_B$, is itself
continuous.

We cannot be sure in general of the uniform convergence of the
integral in the reciprocal formula (61b). The approximating functions

(70b)    $f_A(x) = \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} g(\tau)\,e^{ix\tau}\,d\tau$

certainly are continuous and converge to f(x) for each x. However,
the convergence cannot be uniform if f has discontinuities, as in our
Example 1 on p. 479. Sufficient for uniform convergence of the $f_A(x)$
toward f(x) is again the existence of the improper integral

$\int_{-\infty}^{\infty} |g(\tau)|\,d\tau.$

This condition clearly is violated in the example mentioned, where
$g(\tau) = 2 \sin \tau / (\sqrt{2\pi}\,\tau)$.

For many applications, it is convenient to work only with integrals
that are uniformly and absolutely convergent. Interchanges of limit
operations are usually much harder to justify for integrals that
converge only conditionally. It is easy to impose additional restrictions
on f that guarantee the integrability of |g| over the whole axis, and,
hence, the uniform convergence of the $f_A(x)$. It is sufficient to require
that f(x) have continuous first and second derivatives f'(x) and f''(x)
and that all three integrals

$\int_{-\infty}^{\infty} |f(x)|\,dx, \qquad \int_{-\infty}^{\infty} |f'(x)|\,dx, \qquad \int_{-\infty}^{\infty} |f''(x)|\,dx$

are convergent.
First, the convergence of

$\int_{-\infty}^{\infty} |f'(x)|\,dx$

implies that

$\lim_{x \to \infty} f(x) = \lim_{x \to \infty} \left[ f(0) + \int_0^{x} f'(t)\,dt \right] = f(0) + \int_0^{\infty} f'(t)\,dt$

exists. Obviously,

$\lim_{x \to \infty} f(x)$

can only have the value 0, since otherwise

$\int_0^{\infty} |f(x)|\,dx$

could not converge. Thus, $\lim_{x \to \infty} f(x) = 0$ and, by the same argument,
$\lim_{x \to -\infty} f(x) = 0$. Similarly, the convergence of

$\int_{-\infty}^{\infty} |f''(x)|\,dx$

implies that

$\lim_{x \to \pm\infty} f'(x) = 0,$

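also. Integration by parts applied twice to formula (70a) yields

(71a)    $g_B(\tau) = \frac{1}{\sqrt{2\pi}\,i\tau} \left[ -f(t)\,e^{-i\tau t} \Big|_{-B}^{B} + \int_{-B}^{B} f'(t)\,e^{-i\tau t}\,dt \right]$
$\qquad = \frac{e^{-i\tau B}\,[f'(B) + i\tau f(B)] - e^{i\tau B}\,[f'(-B) + i\tau f(-B)]}{\sqrt{2\pi}\,\tau^2} - \frac{1}{\sqrt{2\pi}\,\tau^2} \int_{-B}^{B} f''(t)\,e^{-i\tau t}\,dt.$

Hence, for $B \to \infty$

(71b)    $g(\tau) = \frac{1}{\sqrt{2\pi}\,i\tau} \int_{-\infty}^{\infty} f'(t)\,e^{-i\tau t}\,dt = -\frac{1}{\sqrt{2\pi}\,\tau^2} \int_{-\infty}^{\infty} f''(t)\,e^{-i\tau t}\,dt$

and thus,

(71c)    $|g(\tau)| \le \frac{1}{\sqrt{2\pi}\,\tau^2} \int_{-\infty}^{\infty} |f''(t)|\,dt = O\!\left(\frac{1}{\tau^2}\right).$

This estimate for $g(\tau)$ clearly implies that

$\int_{-\infty}^{\infty} |g(\tau)|\,d\tau$

converges (see Volume I, p. 307) and, hence, that

$f(x) = \lim_{A \to \infty} f_A(x) = \lim_{A \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} g(\tau)\,e^{ix\tau}\,d\tau$

uniformly for all x. In fact, under the assumptions made on f, it does
not matter how the upper and lower limit in the integral tend to
$\pm\infty$; in general,

$f(x) = \lim_{A \to \infty,\; B \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-B}^{A} g(\tau)\,e^{ix\tau}\,d\tau.$

Equation (71b) can be interpreted as stating that the function f'(t)
has the Fourier transform $i\tau g(\tau)$ and f''(t), the Fourier transform
$-\tau^2 g(\tau)$, where g is the Fourier transform of f. Thus, under suitable
regularity assumptions differentiation of f corresponds to multiplication
of the Fourier transform of f by the factor $i\tau$. This fact is crucial for
many applications of the Fourier transformation.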
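
The correspondence between differentiation and multiplication by iτ can be tested on the Gaussian of Example 3, whose transform is e^(-τ²/2). The sketch below is an illustration under our own assumptions: the transform (61a) is approximated by Simpson's rule on the truncated interval -12 ≤ t ≤ 12, and the function names are hypothetical.

```python
import cmath
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def fourier_transform(func, tau, half_width=12.0, n=6000):
    """Approximate (1/sqrt(2*pi)) * integral of func(t) * exp(-i*tau*t) dt."""
    integrand = lambda t: func(t) * cmath.exp(-1j * tau * t)
    return simpson(integrand, -half_width, half_width, n) / math.sqrt(2.0 * math.pi)

def f(t):
    return math.exp(-0.5 * t * t)

def f_prime(t):
    return -t * math.exp(-0.5 * t * t)

for tau in (0.5, 1.5, 3.0):
    lhs = fourier_transform(f_prime, tau)       # transform of f'
    rhs = 1j * tau * fourier_transform(f, tau)  # i * tau * g(tau)
    print(f"tau = {tau}: transform of f' = {lhs:.8f}, i*tau*g = {rhs:.8f}")
```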

e. Parseval’s Identity for Fourier Transforms


For Fourier series, we proved (Volume I, p. 614) the Parseval 
identity connecting the integral of the square of a periodic function 
with the sum of squares of the Fourier coefficients. A remarkable anal- 
ogous identity exists for Fourier integrals; it is even more symmet- 
ric in form because of the reciprocity between a function f and its 
Fourier transform g. Since, even for real f, the Fourier transform g 
will generally be complex-valued, one has to use the square of the ab- 
solute value rather than the square of the function. The Parseval 
identity then states that the integral of the square of the absolute 
value extended over the whole axis is the same for the function f and 
its Fourier transform g: 


(72)    $\int_{-\infty}^{\infty} |f(x)|^2\,dx = \int_{-\infty}^{\infty} |g(\tau)|^2\,d\tau.$

We shall not prove this identity under the most general assump- 
tions for which it holds, but merely for f restricted in the same way as 
at the end of the last section, namely, when the three functions f, f’, 
f” are all continuous and absolutely integrable over the whole x-axis.! 


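As before, we define the approximations $g_B(\tau)$ to g and $f_A(x)$ to f
by the equations (70a) and (70b). Then we form the expression

$J_{A,B} = \int_{-B}^{B} |f(x) - f_A(x)|^2\,dx$
$\qquad = \int_{-B}^{B} (f(x) - f_A(x))\,(\overline{f(x)} - \overline{f_A(x)})\,dx$
$\qquad = \int_{-B}^{B} \left( f(x)\overline{f(x)} - f_A(x)\overline{f(x)} - f(x)\overline{f_A(x)} + f_A(x)\overline{f_A(x)} \right) dx,$

where the bar above an expression indicates the complex conjugate
value. Now, interchanging integrations, we find that

$\int_{-B}^{B} f(x)\,\overline{f_A(x)}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-B}^{B} f(x)\,dx \int_{-A}^{A} \overline{g(\tau)}\,e^{-ix\tau}\,d\tau$
$\qquad = \int_{-A}^{A} \overline{g(\tau)}\,d\tau\;\frac{1}{\sqrt{2\pi}} \int_{-B}^{B} f(x)\,e^{-ix\tau}\,dx$
$\qquad = \int_{-A}^{A} \overline{g(\tau)}\,g_B(\tau)\,d\tau,$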

1The identity can be extended to more general f by suitably approximating f by 
functions of the restricted class used here. 
whence, taking the complex conjugate, we find

$\int_{-B}^{B} f_A(x)\,\overline{f(x)}\,dx = \int_{-A}^{A} g(\tau)\,\overline{g_B(\tau)}\,d\tau.$

Hence,

(73)    $J_{A,B} = \int_{-B}^{B} \left( |f(x)|^2 + |f_A(x)|^2 \right) dx - \int_{-A}^{A} \left[ \overline{g(\tau)}\,g_B(\tau) + g(\tau)\,\overline{g_B(\tau)} \right] d\tau.$

Since our assumptions about f(x) guarantee that

$\lim_{A \to \infty} f_A(x) = f(x)$

uniformly in x (see p. 487), we also have

$\lim_{A \to \infty} |f(x) - f_A(x)| = 0$

uniformly in x. Consequently,

$\lim_{A \to \infty} J_{A,B} = \lim_{A \to \infty} \int_{-B}^{B} |f(x) - f_A(x)|^2\,dx = 0.$

Thus, identity (73) yields for $A \to \infty$

(74)    $0 = 2 \int_{-B}^{B} |f(x)|^2\,dx - \int_{-\infty}^{\infty} \left[ \overline{g(\tau)}\,g_B(\tau) + g(\tau)\,\overline{g_B(\tau)} \right] d\tau.$

Since

$\lim_{B \to \infty} g_B(\tau) = g(\tau)$

uniformly in $\tau$ and since $g_B(\tau)$ is bounded uniformly, and

$g(\tau) = O\!\left(\frac{1}{\tau^2}\right),$

we can let B tend to $\infty$ in identity (74) to obtain in the limit the
Parseval relation (72).
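
Parseval's relation can be checked numerically for a particular smooth function. In the sketch below the function (1 + x)e^(-x²/2) is our own choice; it and its first two derivatives are absolutely integrable, so it belongs to the restricted class used in the proof. Both the transform and the two energy integrals are approximated by Simpson's rule on truncated intervals, and all names are hypothetical.

```python
import cmath
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def f(x):
    return (1.0 + x) * math.exp(-0.5 * x * x)

def g(tau, half_width=12.0, n=1200):
    """Fourier transform (61a) of f, with the integral truncated to |t| <= half_width."""
    integrand = lambda t: f(t) * cmath.exp(-1j * tau * t)
    return simpson(integrand, -half_width, half_width, n) / math.sqrt(2.0 * math.pi)

energy_f = simpson(lambda x: abs(f(x)) ** 2, -12.0, 12.0, 1200)
energy_g = simpson(lambda tau: abs(g(tau)) ** 2, -10.0, 10.0, 400)
print(f"integral of |f|^2 = {energy_f:.8f}")
print(f"integral of |g|^2 = {energy_g:.8f}")
```

For this f both integrals can also be found in closed form and equal (3/2)√π, since g(τ) = (1 - iτ)e^(-τ²/2).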

f. The Fourier Transformation for Functions of Several Variables

In one dimension the Fourier integral identity yields a representation
of a function f(x) as a linear combination of exponential functions
$e^{ix\xi}$ that depend on a parameter $\xi$. For each value $\xi$ of the
parameter, we multiply the function $e^{ix\xi}$ with a suitable “weight
factor” $g(\xi)/\sqrt{2\pi}$ and integrate with respect to $\xi$. The appropriate
factor $g(\xi)$ is the Fourier transform of f.

Similar formulae exist for decomposition of functions of several
variables into exponential functions. Functions f(x, y) of two independent
variables x, y are represented as combinations of exponential
functions of the form $e^{i(x\xi + y\eta)}$ that depend on the parameters $\xi$, $\eta$.
Similarly, functions f(x, y, z) of three independent variables are built
up from exponentials $e^{i(x\xi + y\eta + z\zeta)}$ depending on the parameters $\xi$, $\eta$, $\zeta$.
Such decompositions of general functions into exponentials constitute
one of the most powerful tools of mathematical analysis. For a given
set of parameters $\xi$, $\eta$, $\zeta$ the function $e^{i(x\xi + y\eta + z\zeta)}$ depends on the single
combination $s = x\xi + y\eta + z\zeta$, which is constant along each plane
with direction numbers $\xi$, $\eta$, $\zeta$ in x, y, z-space. If we introduce a new
rectangular coordinate system in which one of these planes is a coordinate
plane, then $e^{i(x\xi + y\eta + z\zeta)}$ becomes a function of a single coordinate.
In this way, Fourier’s formulae yield a decomposition of f(x, y, z) into
functions that depend only on a single coordinate (where, however,
the direction of the corresponding coordinate axis depends on the
parameters $\xi$, $\eta$, $\zeta$).

Such exponential expressions are intimately connected with the
plane waves encountered in physics. Multiplying the exponential
function $e^{i(x\xi + y\eta + z\zeta)}$ by a time-dependent exponential factor
$e^{-i\omega t}$, we obtain the expression

(75a)    $u(x, y, z, t) = e^{i(x\xi + y\eta + z\zeta)} e^{-i\omega t} = e^{i(x\xi + y\eta + z\zeta - \omega t)}.$

Here $u$ has a fixed value $e^{is}$ for all times $t$ at all locations
$(x, y, z)$ with the same "phase" value

    $s = x\xi + y\eta + z\zeta - \omega t.$


For fixed $s$, this represents at each time $t$ a plane ("wave front") in
$x, y, z$-space with direction numbers $\xi, \eta, \zeta$ for its normal. As
$t$ varies, this plane moves parallel to itself. Since (see p. 135) the
quantity

    $p = \frac{s + \omega t}{\sqrt{\xi^2 + \eta^2 + \zeta^2}}$

is the distance of the plane from the origin at time $t$, the plane moves
with speed

(75b)    $c = \frac{dp}{dt} = \frac{\omega}{\sqrt{\xi^2 + \eta^2 + \zeta^2}}.$

This is the speed of propagation of the wave fronts, corresponding
to a "frequency" $\omega$ of the wave.
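The relation between phase, wave front, and speed in (75a) and (75b) can be checked numerically. The following is a minimal sketch; the function names are illustrative and not part of the text.

```python
import cmath
import math

def plane_wave(x, y, z, t, xi, eta, zeta, omega):
    """u = exp(i(x*xi + y*eta + z*zeta - omega*t)), as in (75a)."""
    return cmath.exp(1j * (x * xi + y * eta + z * zeta - omega * t))

def front_distance(s, t, xi, eta, zeta, omega):
    """Distance p from the origin of the wave front with phase s at time t."""
    return (s + omega * t) / math.sqrt(xi**2 + eta**2 + zeta**2)

def front_speed(xi, eta, zeta, omega):
    """Speed c of the wave fronts, as in (75b)."""
    return omega / math.sqrt(xi**2 + eta**2 + zeta**2)
```

For $(\xi, \eta, \zeta) = (1, 2, 2)$ and $\omega = 6$ the normal has length 3, so the fronts advance with speed 2; evaluating `plane_wave` at the foot of the normal on a front returns $e^{is}$.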

We shall state and prove the Fourier integral theorem for a func- 
tion f(x, y) of two independent variables under conditions on f that are 
sufficient for the validity of the theorem (although far from necessary) 
and are convenient for applications. 

Let $f(x, y)$ be defined and have continuous derivatives of first, second,
and third orders for all values $x, y$. The absolute values of $f$ and its
derivatives of order $\le 3$ shall be absolutely integrable over the whole
plane; that is, for any nonnegative integers $i, k$ with $i + k \le 3$ the
improper integrals

(76)    $\iint \left| \frac{\partial^{i+k} f(x, y)}{\partial x^i \, \partial y^k} \right| dx\, dy,$

extended over the whole $x, y$-plane, shall converge. The Fourier
transform $g(\xi, \eta)$ of $f$ is defined by the formula

(77a)    $g(\xi, \eta) = \frac{1}{2\pi} \iint e^{-i(x\xi + y\eta)} f(x, y)\, dx\, dy.$

The function $f$ is then expressed in terms of its Fourier transform by the
reciprocal formula

(77b)    $f(x, y) = \frac{1}{2\pi} \iint e^{i(x\xi + y\eta)} g(\xi, \eta)\, d\xi\, d\eta.$

Here, all integrals are extended over the whole plane and converge
absolutely.
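As a numerical illustration of definition (77a), one can approximate the transform of $f(x, y) = e^{-(x^2 + y^2)/2}$, which under this normalization is $g(\xi, \eta) = e^{-(\xi^2 + \eta^2)/2}$. The sketch below assumes a midpoint rule on a truncated square; the truncation width and grid size are choices made here, not taken from the text.

```python
import cmath
import math

def fourier_2d(f, xi, eta, half_width=8.0, n=160):
    """Midpoint-rule approximation of (77a):
    g(xi, eta) = (1 / (2 pi)) * double integral of exp(-i(x xi + y eta)) f(x, y)."""
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n):
        x = -half_width + (j + 0.5) * h
        for k in range(n):
            y = -half_width + (k + 0.5) * h
            total += cmath.exp(-1j * (x * xi + y * eta)) * f(x, y)
    return total * h * h / (2.0 * math.pi)

def gaussian(x, y):
    """f(x, y) = exp(-(x^2 + y^2) / 2), smooth and absolutely integrable."""
    return math.exp(-(x * x + y * y) / 2.0)
```

Comparing `fourier_2d(gaussian, xi, eta)` with `math.exp(-(xi**2 + eta**2) / 2)` at a few points gives agreement well beyond plotting accuracy.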
An analogous statement holds for functions $f(x_1, \ldots, x_n)$ of $n$
independent variables. We only have to assume that $f$ and its derivatives
of order $\le n + 1$ exist and are absolutely integrable over the whole
space. The Fourier transform $g(\xi_1, \xi_2, \ldots, \xi_n)$ is then defined by

(77a)    $g = (2\pi)^{-n/2} \int \cdots \int e^{-i(x_1\xi_1 + \cdots + x_n\xi_n)} f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n.$

The reciprocal formula for $f(x_1, \ldots, x_n)$ here becomes

(77b)    $f = (2\pi)^{-n/2} \int \cdots \int e^{i(x_1\xi_1 + \cdots + x_n\xi_n)} g(\xi_1, \ldots, \xi_n)\, d\xi_1 \cdots d\xi_n.$

The proof for $n$ dimensions is exactly the same as the proof for the
two-dimensional case that will be given now.

We shall first prove the Fourier integral theorem for a function
$f(x, y)$ of class $C^3$ and of compact support, meaning that $f$ has
continuous derivatives of order $\le 3$ and vanishes outside some bounded
set. For this situation the Fourier formula for $f$ follows immediately
from the formula for functions of a single variable, as we now show.
The Fourier transform

    $g(\xi, \eta) = \frac{1}{2\pi} \iint e^{-i(x\xi + y\eta)} f(x, y)\, dx\, dy$

is given by a proper integral, since $f$ vanishes outside a bounded
region. Introducing the "intermediate" Fourier transform with respect
to $y$ alone, namely,

(77c)    $\gamma(x, \eta) = \frac{1}{\sqrt{2\pi}} \int e^{-iy\eta} f(x, y)\, dy,$

we can write $g$ in the form

    $g(\xi, \eta) = \frac{1}{\sqrt{2\pi}} \int e^{-ix\xi} \gamma(x, \eta)\, dx.$

Obviously, for each value of $\eta$, we have in $\gamma(x, \eta)$ a function of
the single variable $x$ of class $C^3$ and of bounded support. Its Fourier
transform is $g(\xi, \eta)$. The theorem of p. 477 applies and yields

(78)    $\gamma(x, \eta) = \frac{1}{\sqrt{2\pi}} \int e^{ix\xi} g(\xi, \eta)\, d\xi.$


On the other hand, $\gamma(x, \eta)$ for fixed $x$ is the Fourier transform of
$f(x, y)$ considered as a function of $y$ alone. Hence, the reciprocal
formula

    $f(x, y) = \frac{1}{\sqrt{2\pi}} \int e^{iy\eta} \gamma(x, \eta)\, d\eta$

holds. Substituting here for $\gamma$ its expression from (78) yields

    $f(x, y) = \frac{1}{2\pi} \int d\eta \int e^{i(x\xi + y\eta)} g(\xi, \eta)\, d\xi.$
In this formula, the repeated integral (first with respect to $\xi$ and then
with respect to $\eta$) can be replaced by a double integral over the whole
$\xi, \eta$-plane, which leads to formula (77b). This step is valid (see
p. 466), since the single integral

(79a)    $\int |g(\xi, \eta)|\, d\xi$

converges uniformly in $\eta$ for all $\eta$ and, in addition, the double
integral

(79b)    $\iint |g(\xi, \eta)|\, d\xi\, d\eta$

converges. Both convergence results follow if we can show that an
estimate of the form

(79c)    $|g(\xi, \eta)| \le \frac{M}{(1 + \xi^2 + \eta^2)^{3/2}}$

holds for $g$ with a suitable constant $M$. The convergence of the double
integral (79b) is a consequence of (79c). The uniform convergence of
the single integral (79a) follows from (79c) since for $A > 1$

    $\int_{|\xi| > A} |g(\xi, \eta)|\, d\xi \le M \int_{|\xi| > A} \frac{d\xi}{(1 + \xi^2 + \eta^2)^{3/2}} \le M \int_{|\xi| > A} \frac{d\xi}{(1 + \xi^2)^{3/2}}$
    $\le M \int_{|\xi| > A} \frac{2|\xi|\, d\xi}{(1 + \xi^2)^2} = \frac{2M}{1 + A^2};$

the right side tends to 0 for $A \to \infty$ independently of $\eta$.
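Inequality (79c) is established from (77a) by repeated integration
by parts. Since $f$ has compact support, we find that

    $\iint e^{-i(x\xi + y\eta)} \frac{\partial^3 f(x, y)}{\partial x^3}\, dx\, dy = 2\pi (i\xi)^3 g(\xi, \eta),$

    $\iint e^{-i(x\xi + y\eta)} \frac{\partial^3 f(x, y)}{\partial y^3}\, dx\, dy = 2\pi (i\eta)^3 g(\xi, \eta)$

and, hence, that

    $2\pi (1 + |\xi|^3 + |\eta|^3)\, |g(\xi, \eta)| = |2\pi g(\xi, \eta)| + |2\pi (i\xi)^3 g(\xi, \eta)| + |2\pi (i\eta)^3 g(\xi, \eta)|$
    $\le \iint \left( |f(x, y)| + \left| \frac{\partial^3 f(x, y)}{\partial x^3} \right| + \left| \frac{\partial^3 f(x, y)}{\partial y^3} \right| \right) dx\, dy.$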
For any $\xi, \eta$ let the largest of the three quantities $1, |\xi|, |\eta|$ be
denoted by $\zeta$. Then

    $(1 + \xi^2 + \eta^2)^{3/2} \le (\zeta^2 + \zeta^2 + \zeta^2)^{3/2} = 3\sqrt{3}\, \zeta^3 \le 3\sqrt{3}\, (1 + |\xi|^3 + |\eta|^3).$

This yields the inequality (79c) with the value

(79d)    $M = \frac{3\sqrt{3}}{2\pi} \iint \left( |f(x, y)| + \left| \frac{\partial^3 f(x, y)}{\partial x^3} \right| + \left| \frac{\partial^3 f(x, y)}{\partial y^3} \right| \right) dx\, dy$

for the constant and completes the proof of the Fourier theorem for
functions $f(x, y)$ of class $C^3$ and of compact support.
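The elementary inequality used to pass from the cubic bound to (79c) can be spot-checked on a grid. This is only a sanity check under an arbitrary choice of sample points, not a proof.

```python
import math

def lhs(xi, eta):
    """(1 + xi^2 + eta^2)^(3/2)."""
    return (1.0 + xi * xi + eta * eta) ** 1.5

def rhs(xi, eta):
    """3 sqrt(3) (1 + |xi|^3 + |eta|^3)."""
    return 3.0 * math.sqrt(3.0) * (1.0 + abs(xi) ** 3 + abs(eta) ** 3)

# Largest ratio lhs/rhs over a grid of points with spacing 0.25 in [-20, 20]^2.
worst_ratio = max(
    lhs(0.25 * a, 0.25 * b) / rhs(0.25 * a, 0.25 * b)
    for a in range(-80, 81)
    for b in range(-80, 81)
)
```

On this grid `worst_ratio` stays below 1, consistent with the inequality.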

The proof of the theorem for the most general $f$ of class $C^3$ for which
the integrals (76) converge follows by approximating such $f$ by
functions $f_n(x, y)$ of compact support. For this purpose we multiply
$f(x, y)$ by a suitable "cut-off" function $\phi_n(x, y)$ so that the product
$f_n = \phi_n f$ has compact support, but agrees with $f$ in the disk
$x^2 + y^2 \le n^2$. Here we only require an auxiliary function
$\phi_n(x, y)$ with these properties:

1. $\phi_n(x, y)$ has compact support and belongs to $C^3$;

2. $\phi_n(x, y) = 1$ for $x^2 + y^2 \le n^2$;

3. The absolute values of $\phi_n(x, y)$ and of all its derivatives of orders
$\le 3$ do not exceed a fixed quantity $N$ independently of $x, y$ and $n$.

Suitable functions $\phi_n$ can be constructed easily in a variety of ways.¹
Denote by $g_n(\xi, \eta)$ the Fourier transform of $f_n = \phi_n f$:


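(80a)    $g_n(\xi, \eta) = \frac{1}{2\pi} \iint e^{-i(x\xi + y\eta)} \phi_n(x, y) f(x, y)\, dx\, dy.$

Then

    $|g_n(\xi, \eta) - g(\xi, \eta)| = \left| \frac{1}{2\pi} \iint e^{-i(x\xi + y\eta)} (\phi_n f - f)\, dx\, dy \right|$
    $\le \frac{1}{2\pi} \iint_{x^2 + y^2 > n^2} (|\phi_n| + 1)\, |f|\, dx\, dy$
    $\le (N + 1) \iint_{x^2 + y^2 > n^2} |f|\, dx\, dy.$

¹For example, define the function $h(s)$ by

    $h(s) = 1$ for $s \le 0$,
    $h(s) = (1 - s^4)^4$ for $0 \le s \le 1$,
    $h(s) = 0$ for $1 \le s$.

Then

    $\phi_n(x, y) = h(x - n)\, h(-n - x)\, h(y - n)\, h(-n - y)$

has all the desired properties.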
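The cut-off function of the footnote is easy to write down explicitly. The sketch below transcribes $h$ and $\phi_n$; the Python names are illustrative.

```python
def h(s):
    """Footnote profile: 1 for s <= 0, (1 - s^4)^4 on [0, 1], 0 for s >= 1."""
    if s <= 0.0:
        return 1.0
    if s >= 1.0:
        return 0.0
    return (1.0 - s**4) ** 4

def cutoff(n, x, y):
    """phi_n(x, y) = h(x - n) h(-n - x) h(y - n) h(-n - y)."""
    return h(x - n) * h(-n - x) * h(y - n) * h(-n - y)
```

Inside the disk $x^2 + y^2 \le n^2$ all four factors equal 1, and outside the square $|x|, |y| \le n + 1$ at least one factor vanishes, so `cutoff` has compact support and takes values between 0 and 1.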


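From the assumed convergence of the integral of $|f|$ over the whole
plane it follows that

(80b)    $\lim_{n \to \infty} g_n(\xi, \eta) = g(\xi, \eta)$

uniformly for all $(\xi, \eta)$. In order to see that $g(\xi, \eta)$ again
satisfies an inequality of the form (79c), we observe that by Leibniz's
rule

    $\left| \frac{\partial^3 f_n}{\partial x^3} \right| = \left| \frac{\partial^3 (\phi_n f)}{\partial x^3} \right| \le N \left( \left| \frac{\partial^3 f}{\partial x^3} \right| + 3 \left| \frac{\partial^2 f}{\partial x^2} \right| + 3 \left| \frac{\partial f}{\partial x} \right| + |f| \right).$

A similar estimate holds for the third $y$-derivative of $f_n$. Let $I$ be the
largest of the integrals, taken over the whole plane, of the absolute
values of $f$ and its derivatives of orders $\le 3$. Then

    $\iint \left( |f_n| + \left| \frac{\partial^3 f_n}{\partial x^3} \right| + \left| \frac{\partial^3 f_n}{\partial y^3} \right| \right) dx\, dy \le (1 + 8 + 8)\, N I = 17 N I.$

Applying the inequality (79c, d) to the function $f_n$, we find that for any
$n$ and all $\xi, \eta$, the inequality

(80c)    $|g_n(\xi, \eta)| \le \frac{M}{(1 + \xi^2 + \eta^2)^{3/2}}$

holds with

    $M = \frac{51\sqrt{3}}{2\pi}\, N I.$

It follows from (80b) that

    $|g(\xi, \eta)| \le \frac{M}{(1 + \xi^2 + \eta^2)^{3/2}}$

for all $(\xi, \eta)$, with the same constant $M$.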
Since $f_n$ has compact support, the reciprocal formula

(80d)    $f_n(x, y) = \frac{1}{2\pi} \iint e^{i(x\xi + y\eta)} g_n(\xi, \eta)\, d\xi\, d\eta$

is known already to be valid. For a given $(x, y)$ we have
$f_n(x, y) = f(x, y)$, once $n$ is so large that $n^2 > x^2 + y^2$. For
$n \to \infty$ we then obtain from (80d), using (80b) and (80c), the
reciprocity law (77b) for $f$ itself.
Parseval's identity for multiple Fourier integrals takes the form

(81)    $\iint |f(x, y)|^2\, dx\, dy = \iint |g(\xi, \eta)|^2\, d\xi\, d\eta,$

where the integrations are extended over the whole plane. The proof
can be carried out by exactly the same arguments as those used in
Section e, p. 488, for the Parseval identity for functions of a single
variable, provided we make the same assumptions about $f(x, y)$ as for the
derivation of the Fourier integral formula. Modifying the expressions
used on p. 488 appropriately, we consider the integral

    $J_{A,B} = \iint_{x^2 + y^2 < B^2} |f(x, y) - f_A(x, y)|^2\, dx\, dy,$

where

    $f_A(x, y) = \frac{1}{2\pi} \iint_{\xi^2 + \eta^2 < A^2} e^{i(x\xi + y\eta)} g(\xi, \eta)\, d\xi\, d\eta,$

    $g_B(\xi, \eta) = \frac{1}{2\pi} \iint_{x^2 + y^2 < B^2} e^{-i(x\xi + y\eta)} f(x, y)\, dx\, dy.$

Here, instead of (73) we obtain the identity

    $J_{A,B} = \iint_{x^2 + y^2 < B^2} \left( |f(x, y)|^2 + |f_A(x, y)|^2 \right) dx\, dy$
    $\quad - \iint_{\xi^2 + \eta^2 < A^2} \left[ \overline{g_B(\xi, \eta)}\, g(\xi, \eta) + g_B(\xi, \eta)\, \overline{g(\xi, \eta)} \right] d\xi\, d\eta.$

For $A \to \infty$ and $B \to \infty$ the identity (81) follows in the same
manner as before.
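Parseval's identity (81) can be illustrated with a function whose transform is known in closed form. The sketch below assumes $f(x, y) = e^{-x^2/2 - y^2}$, whose transform under convention (77a) is $g(\xi, \eta) = e^{-\xi^2/2 - \eta^2/4}/\sqrt{2}$ (a standard Gaussian computation done by hand for this example), and compares the two sides by a midpoint rule.

```python
import math

def integrate_2d(func, half_width=10.0, n=200):
    """Midpoint rule over the square [-half_width, half_width]^2."""
    h = 2.0 * half_width / n
    total = 0.0
    for j in range(n):
        u = -half_width + (j + 0.5) * h
        for k in range(n):
            v = -half_width + (k + 0.5) * h
            total += func(u, v)
    return total * h * h

def f(x, y):
    return math.exp(-x * x / 2.0 - y * y)

def g(xi, eta):
    # Transform of f under (77a); derived by hand, an assumption of this sketch.
    return math.exp(-xi * xi / 2.0 - eta * eta / 4.0) / math.sqrt(2.0)

lhs = integrate_2d(lambda x, y: f(x, y) ** 2)   # integral of |f|^2
rhs = integrate_2d(lambda a, b: g(a, b) ** 2)   # integral of |g|^2
```

Both `lhs` and `rhs` agree with the exact common value $\pi/\sqrt{2}$.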
Exercises 4.13


1. Find the Fourier transforms of the following functions:

(a) $f(x) = c$ for $0 < x < a$, and $f(x) = 0$ for $x < 0$ or $x > a$.

(b) $f(x) = e^{-ax}$ for $x > 0$, and $f(x) = e^{ax}$ for $x < 0$ $(a > 0)$.

(c) $J_n(x)/x^n$ (with $J_n$ defined as in 4.12, Exercise 8).


4.14 The Eulerian Integrals (Gamma Function)¹


One of the most important examples of a function defined by an
improper integral involving a parameter is the gamma function $\Gamma(x)$,
which we shall discuss in some detail.


a. Definition and Functional Equation 


In Volume I (p. 308) we defined $\Gamma(x)$ for every $x > 0$ by the improper
integral

(82a)    $\Gamma(x) = \int_0^\infty e^{-t} t^{x-1}\, dt.$

We can split up the integral into one extended over the unbounded
portion of the $t$-axis from $t = 1$ to $t = \infty$ with a continuous
integrand and one extended over the finite interval from $t = 0$ to
$t = 1$, where (at least for values of $x$ between 0 and 1) the integrand
is singular. The tests developed on p. 000 show at once that the
integral (82a) converges for any $x > 0$, the convergence being uniform
in every closed interval of the positive $x$-axis that does not include
the point $x = 0$. The function $\Gamma(x)$ is therefore continuous for
$x > 0$.

The integrals obtained by formal differentiation of formula (82a)
also converge uniformly in any interval $0 < a < x < b$. Consequently
(see p. 465), $\Gamma(x)$ has continuous first and second derivatives given by

(82b)    $\Gamma'(x) = \int_0^\infty e^{-t} t^{x-1} \log t\, dt,$

(82c)    $\Gamma''(x) = \int_0^\infty e^{-t} t^{x-1} \log^2 t\, dt.$


1A discussion related to the present one is given by E. Artin, The Gamma Function 
(English translation by Michael Butler), Holt, Rinehart and Winston: New York, 
1964. 
By simple substitution the integral (82a) for $\Gamma(x)$ can be
transformed into other forms that are frequently used. Here we only
mention the substitution $t = u^2$, which transforms the gamma function
into the form

    $\Gamma(x) = 2 \int_0^\infty e^{-u^2} u^{2x-1}\, du.$

Thus, for $\alpha = 2x - 1$,

(82d)    $\int_0^\infty e^{-u^2} u^\alpha\, du = \frac{1}{2}\, \Gamma\!\left( \frac{1 + \alpha}{2} \right)$    $(\alpha > -1)$

[cf. formula (48d), p. 458].
As in Volume I (p. 308), integration by parts in formula (82a) yields
the relation

(83a)    $\Gamma(x + 1) = x\, \Gamma(x)$


for any x > 0. This equation is called the functional equation of the 
gamma function. 
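The substituted form $\Gamma(x) = 2\int_0^\infty e^{-u^2} u^{2x-1}\, du$ lends itself to a quick numerical check of the functional equation. The sketch below uses a midpoint rule on a truncated interval; the truncation point and the number of nodes are choices made here, and the rule is accurate mainly when $2x - 1$ is a nonnegative even integer, where the integrand is smooth and even.

```python
import math

def gamma_via_substitution(x, upper=10.0, n=2000):
    """Midpoint rule for Gamma(x) = 2 * integral_0^inf exp(-u^2) u^(2x-1) du."""
    h = upper / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += math.exp(-u * u) * u ** (2.0 * x - 1.0)
    return 2.0 * total * h
```

For example, `gamma_via_substitution(2.5)` agrees with `1.5 * gamma_via_substitution(1.5)`, as (83a) requires, and both match the library values `math.gamma(2.5)` and `math.gamma(1.5)`.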

Clearly, $\Gamma(x)$ is not uniquely defined by the property of being a
solution of this functional equation since we obtain other solutions
merely by multiplying $\Gamma(x)$ by an arbitrary function $p(x)$ with period
unity. The expression

(83b)    $u(x) = \Gamma(x)\, p(x),$

where

(83c)    $p(x + 1) = p(x),$

represents the most general solution of equation (83a), for if $u(x)$ is
any solution, the quotient

    $p(x) = \frac{u(x)}{\Gamma(x)}$

[which can always be formed since $\Gamma(x) \ne 0$] satisfies equation (83c).

Instead of $\Gamma(x)$ it is frequently more convenient to consider the
function $u(x) = \log \Gamma(x)$; this is defined for all positive $x$, since
$\Gamma(x) > 0$ for $x > 0$. The function satisfies the functional equation
(a "difference equation")

(83d)    $u(x + 1) - u(x) = \log x.$
We obtain other solutions of (83d) by adding to $\log \Gamma(x)$ an
arbitrary function with period unity. In order to characterize the
function $\log \Gamma(x)$ uniquely, we must supplement the functional
equation (83d) by other conditions. One very simple condition of this
type is given by the following theorem of H. Bohr and H. Mollerup:


Every convex solution of the difference equation

(84a)    $u(x + 1) - u(x) = \log x$

for $x > 0$ is identical with the function $\log \Gamma(x)$, except perhaps
for an additive constant.
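The role of convexity in the theorem can be seen on a concrete example. The sketch below adds the 1-periodic term $0.1 \sin 2\pi x$ to $\log \Gamma(x)$ (the amplitude is an arbitrary choice for illustration): the result still solves (84a), but a second difference shows that it is not convex.

```python
import math

def log_gamma_perturbed(x, amplitude=0.1):
    """log Gamma(x) plus a 1-periodic perturbation; still a solution of (84a)."""
    return math.lgamma(x) + amplitude * math.sin(2.0 * math.pi * x)

def second_difference(u, x, h):
    """u(x + h) - 2 u(x) + u(x - h); nonnegative for every convex u."""
    return u(x + h) - 2.0 * u(x) + u(x - h)
```

At $x = 5.25$ with $h = 0.1$ the second difference of `math.lgamma` is positive, while that of `log_gamma_perturbed` is negative, so the perturbed solution is excluded by the convexity requirement.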


b. Convex Functions. Proof of Bohr and Mollerup’s Theorem 


A function $f(x)$ with continuous second derivative is called convex
(see Volume I, p. 357) if $f'' \ge 0$. A more general definition,
applicable even to functions that are not twice differentiable, is the
following:

The function $f(x)$ defined in an interval (possibly extending to
infinity) is called convex if for any values $x_1, x_2$ of its domain and
any positive numbers $\alpha, \beta$ with $\alpha + \beta = 1$ the inequality

(84b)    $f(\alpha x_1 + \beta x_2) \le \alpha f(x_1) + \beta f(x_2)$

holds. Geometrically (84b) means that for any two points of the curve
$y = f(x)$ with abscissas $x_1, x_2$, the chord joining them never lies
beneath the curve (cf. Fig. 4.20).


Figure 4.20 A convex function.
For a twice continuously differentiable function $f$, we find, using
the mean value theorem of differential calculus and the fact that
$\alpha$ and $\beta$ are positive numbers with sum 1,

(84c)    $\alpha f(x_1) + \beta f(x_2) - f(\alpha x_1 + \beta x_2)$
    $= \beta [f(x_2) - f(\alpha x_1 + \beta x_2)] - \alpha [f(\alpha x_1 + \beta x_2) - f(x_1)]$
    $= \alpha\beta (x_2 - x_1) f'(\xi_2) - \alpha\beta (x_2 - x_1) f'(\xi_1)$
    $= \alpha\beta (x_2 - x_1)(\xi_2 - \xi_1) f''(\eta),$

where $\xi_1, \xi_2, \eta$ are suitable intermediate values with

(84d)    $x_1 < \xi_1 < \alpha x_1 + \beta x_2 < \xi_2 < x_2, \qquad \xi_1 < \eta < \xi_2.$

It follows immediately from (84c) that (84b) is satisfied if
$f''(\eta) \ge 0$ for all $\eta$ in the domain of $f$. Conversely, we find
from (84b), (84c), using (84d), that $f''(\eta) \ge 0$; for fixed
$\alpha, \beta$ and $x_2 \to x_1$ it follows from the continuity of $f''$
that $f''(x_1) \ge 0$ for any $x_1$ in the domain. Hence, a twice
continuously differentiable function $f$ is convex in the sense of (84b)
if and only if $f'' \ge 0$.

To be convex, a function need not be twice, or even once,
differentiable. An example is furnished by $f(x) = |x|$. However, a
convex function necessarily is continuous at interior points of its
domain. This follows from the inequality

(84e)    $\frac{f(x_2) - f(x_1)}{x_2 - x_1} \le \frac{f(x_4) - f(x_3)}{x_4 - x_3}$

satisfied by a convex function for any $x_i$ in its domain for which
$x_1 < x_2 < x_3 < x_4$.
To prove (84e) we write $x_2$ in the form

    $x_2 = \alpha x_1 + \beta x_3,$

where

    $\alpha = \frac{x_3 - x_2}{x_3 - x_1}, \qquad \beta = \frac{x_2 - x_1}{x_3 - x_1}.$

Then

    $\frac{f(x_3) - f(x_2)}{x_3 - x_2} - \frac{f(x_2) - f(x_1)}{x_2 - x_1} = \frac{\alpha f(x_1) + \beta f(x_3) - f(\alpha x_1 + \beta x_3)}{\alpha\beta (x_3 - x_1)} \ge 0$

and, similarly,

    $\frac{f(x_4) - f(x_3)}{x_4 - x_3} - \frac{f(x_3) - f(x_2)}{x_3 - x_2} \ge 0,$

which implies (84e). In words, (84e) states that the difference quotients
of the convex function $f$ formed for disjoint intervals are increasing.
It follows that

    $\frac{f(x_2) - f(x_1)}{x_2 - x_1} \le \frac{f(\xi_2) - f(\xi_1)}{\xi_2 - \xi_1} \le \frac{f(x_4) - f(x_3)}{x_4 - x_3}$

for any values $\xi_1, \xi_2$ between $x_2$ and $x_3$. Thus, $f$ satisfies a
Lipschitz condition in the interval $x_2 < x < x_3$ and, hence, is
continuous in that interval. For any $x$ in the interior of the domain of
$f$ we can always find suitable $x_1, x_2, x_3, x_4$, showing that $f$ is
continuous at $x$.
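Inequality (84e) is simple to test on examples. The following sketch (helper names are illustrative) compares the difference quotients over two disjoint intervals.

```python
def difference_quotient(f, a, b):
    """(f(b) - f(a)) / (b - a) for a < b."""
    return (f(b) - f(a)) / (b - a)

def quotients_increase(f, x1, x2, x3, x4):
    """Check (84e) for one choice of points x1 < x2 < x3 < x4."""
    return difference_quotient(f, x1, x2) <= difference_quotient(f, x3, x4)
```

For the convex function $|x|$ the check succeeds for every admissible choice of points, while for the concave function $-x^2$ it already fails for $x_1 = -2$, $x_2 = -0.5$, $x_3 = 0.25$, $x_4 = 3$.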

In order to prove that the function $\log \Gamma(x)$ is convex, it is
sufficient to show that

(84f)    $\frac{d^2 \log \Gamma}{dx^2} = \frac{\Gamma'' \Gamma - \Gamma'^2}{\Gamma^2} \ge 0.$

The relation (84f) follows from the Cauchy-Schwarz inequality¹ for
integrals, since, here by (82a, b, c),

    $\Gamma'^2 = \left( \int_0^\infty e^{-t} t^{x-1} \log t\, dt \right)^2$
    $= \left( \int_0^\infty \left( e^{-t/2} \sqrt{t^{x-1}} \right) \left( e^{-t/2} \sqrt{t^{x-1}}\, \log t \right) dt \right)^2$
    $\le \int_0^\infty e^{-t} t^{x-1}\, dt \int_0^\infty e^{-t} t^{x-1} \log^2 t\, dt = \Gamma\, \Gamma''.$

¹From the Cauchy-Schwarz inequality for sums (Volume I, p. 15) we find
for any continuous functions $f(x), g(x)$ and any subdivision of their
domain by points $x_i$ into intervals of length $\Delta x_i$ that

    $\left( \sum_i f(x_i) g(x_i) \Delta x_i \right)^2 \le \left( \sum_i f^2(x_i) \Delta x_i \right) \left( \sum_i g^2(x_i) \Delta x_i \right).$

Refining the subdivisions we find in the limit the Cauchy-Schwarz
inequality for integrals:

    $\left( \int_a^b f(x) g(x)\, dx \right)^2 \le \left( \int_a^b f^2\, dx \right) \left( \int_a^b g^2\, dx \right).$

This inequality is extended immediately from proper Riemann integrals
of continuous functions to improper integrals by passage to the limit
with respect to the domain of integration.
Now let $u(x)$ be an arbitrary convex solution of the functional
equation (84a) for $x > 0$. We form the expression

    $v_h(x) = u(x + h) - 2u(x) + u(x - h)$

for $0 < h < x$. Applying relation (84e), which is valid for convex $u$,
we find for $0 < h < k < x$ that

    $v_k(x) - v_h(x) = [u(x + k) - u(x + h)] - [u(x - h) - u(x - k)]$
    $= (k - h) \left[ \frac{u(x + k) - u(x + h)}{k - h} - \frac{u(x - h) - u(x - k)}{-h + k} \right] \ge 0.$

For fixed $x$, therefore, $v_h(x)$ is a continuous nondecreasing function
of $h$. Now, the functional equation for $u$ yields

    $v_1(x) = u(x + 1) - 2u(x) + u(x - 1)$
    $= [u(x + 1) - u(x)] - [u(x) - u(x - 1)]$
    $= \log x - \log (x - 1).$

Hence, for $0 < h < 1 < x$,

(84g)    $0 = v_0(x) \le v_h(x) = u(x + h) - 2u(x) + u(x - h) \le v_1(x) = \log \frac{x}{x - 1}.$

Since

    $\lim_{x \to \infty} \log \frac{x}{x - 1} = \log 1 = 0,$

we find from (84g) that for every convex solution of (84a)

    $\lim_{x \to \infty} [u(x + h) - 2u(x) + u(x - h)] = 0$    $(0 < h < 1).$

If then $p(x)$ is the difference of two convex solutions of (84a), we find
that also

    $\lim_{x \to \infty} [p(x + h) - 2p(x) + p(x - h)] = 0.$

Since $p(x)$ is periodic with period 1, so also is the function

    $p(x + h) - 2p(x) + p(x - h),$

and it approaches 0 as a limit for $x \to \infty$. Obviously, such a
function must vanish identically. Hence,

(84h)    $p(x + h) - 2p(x) + p(x - h) = 0$    $(0 < h < 1).$

Let $M = p(\xi)$ be the largest value of the continuous function $p(x)$ in
the interval $1 \le x \le 2$. Then $p(x) \le M$ for all $x > 0$ and by (84h)

    $2M = 2p(\xi) = p(\xi + h) + p(\xi - h) \le 2M$    $(0 < h < 1).$

Hence,

    $p(\xi - h) = p(\xi + h) = M$    $(0 < h < 1),$

and since $p$ has period 1,

    $p(x) = M = \text{constant}$    (all $x > 0$).


This shows that any two convex solutions of (84a) differ at most by a 
constant and completes the proof of Bohr and Mollerup’s theorem. 
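The two-sided estimate (84g) can be observed directly for $u = \log \Gamma$. The sketch below relies on the standard-library function `math.lgamma`; the sample points are arbitrary.

```python
import math

def v(h, x):
    """Second difference v_h(x) = u(x+h) - 2u(x) + u(x-h) with u = log Gamma."""
    return math.lgamma(x + h) - 2.0 * math.lgamma(x) + math.lgamma(x - h)

def bound(x):
    """Upper bound v_1(x) = log(x / (x - 1)) from (84g), valid for x > 1."""
    return math.log(x / (x - 1.0))
```

For instance, `v(0.9, 1.5)` lies between 0 and `bound(1.5)`, and both `v(h, x)` and `bound(x)` shrink toward 0 as `x` grows, which is the limit statement used in the proof.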


c. The Infinite Product for the Gamma Function 


Bohr and Mollerup's theorem can be used to derive the infinite
product representations for the gamma function found by Gauss and
Weierstrass.
For any given function $g(x)$ we can easily verify that a special
solution $w(x)$ of the difference equation

    $w(x + 1) - w(x) = g(x)$

is given by the infinite series

    $w(x) = -\sum_{j=0}^\infty g(x + j) = -g(x) - g(x + 1) - g(x + 2) - \cdots,$

provided that series converges. We cannot apply this observation
directly to equation (84a) with $g(x) = \log x$, since the resulting
series diverges. However, the difference equation for $w = u''$ obtained
by differentiating (84a) twice can be solved in this way. A special
solution of the equation

(85a)    $w(x + 1) - w(x) = -\frac{1}{x^2}$    $(x > 0)$
is given by

(85b)    $w(x) = \frac{1}{x^2} + \sum_{j=1}^\infty \frac{1}{(x + j)^2}$    $(x > 0).$

Here, the infinite series converges uniformly in every finite interval
$0 < x < b$ (see Volume I, p. 535) since

    $\frac{1}{(x + j)^2} \le \frac{1}{j^2}$    $(x \ge 0).$

Consequently, $w$ is continuous for $x > 0$. Moreover, term-by-term
integration of the series is permitted (see Volume I, p. 537) and leads
to a function

(85c)    $v(x) = -\frac{1}{x} + \sum_{j=1}^\infty \int_0^x \frac{d\xi}{(\xi + j)^2} = -\frac{1}{x} - \sum_{j=1}^\infty \left( \frac{1}{x + j} - \frac{1}{j} \right),$

where the series occurring in this formula again converges uniformly
in any interval $0 < x < b$. Thus $v(x) + 1/x$ is a continuous function of
$x$ for $x \ge 0$ that vanishes for $x = 0$. By the foregoing construction

(85d)    $v'(x) = w(x)$    $(x > 0).$

Since, by (85a, d),

    $\frac{d}{dx} [v(x + 1) - v(x)] = -\frac{1}{x^2}$    $(x > 0),$

it follows that

(85e)    $v(x + 1) - v(x) = \frac{1}{x} + c$    $(x > 0),$

where $c$ is a constant. In order to determine the value of $c$, we observe
that by (85e)

    $-c = \lim_{x \to 0} \left[ v(x) + \frac{1}{x} \right] - \lim_{x \to 0} v(x + 1) = -v(1)$
    $= 1 + \sum_{j=1}^\infty \left( \frac{1}{1 + j} - \frac{1}{j} \right)$
    $= 1 + \left( \frac{1}{2} - 1 \right) + \left( \frac{1}{3} - \frac{1}{2} \right) + \left( \frac{1}{4} - \frac{1}{3} \right) + \cdots = 0.$
Integration of (85c) leads to a function

(85f)    $U(x) = -\log x - \sum_{j=1}^\infty \int_0^x \left( \frac{1}{\xi + j} - \frac{1}{j} \right) d\xi$
    $= -\log x - \sum_{j=1}^\infty \left[ \log (x + j) - \log j - \frac{x}{j} \right],$

where the infinite series again converges uniformly in any interval
$0 < x < b$. As before we conclude that $U(x)$ is a continuous function of
$x$ for $x > 0$ satisfying

    $U'(x) = v(x), \qquad \lim_{x \to 0} (U(x) + \log x) = 0,$

(85g)    $U(x + 1) - U(x) - \log x = \text{constant} = C.$

Here,

    $C = \lim_{x \to 0} U(x + 1) - \lim_{x \to 0} [U(x) + \log x] = U(1)$
    $= -\sum_{j=1}^\infty \left[ \log (1 + j) - \log j - \frac{1}{j} \right]$
    $= -\lim_{n \to \infty} \sum_{j=1}^{n-1} \left[ \log (1 + j) - \log j - \frac{1}{j} \right]$
    $= \lim_{n \to \infty} \left( 1 + \frac{1}{2} + \cdots + \frac{1}{n - 1} - \log n \right).$

It follows that $C$ is identical with Euler's constant

(85h)    $C = \lim_{n \to \infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log n \right)$

introduced in Volume I (p. 526).
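The limit (85h) converges slowly, with an error of roughly $1/(2n)$. A minimal sketch of the partial expression:

```python
import math

def euler_constant_estimate(n):
    """1 + 1/2 + ... + 1/n - log n, the n-th stage of (85h)."""
    return sum(1.0 / j for j in range(1, n + 1)) - math.log(n)
```

With `n = 100000` the estimate agrees with the known value $C = 0.5772156649\ldots$ to about five decimal places and approaches it from above.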
By (85g) the function

    $u(x) = U(x) - Cx$

satisfies the difference equation

    $u(x + 1) - u(x) = \log x.$

Moreover, by (85b)

    $u''(x) = w(x) > 0$    $(x > 0),$

so that $u(x)$ is convex. Since, in addition,

    $u(1) = U(1) - C = 0 = \log \Gamma(1),$

it follows from Bohr's theorem that $u(x)$ and $\log \Gamma(x)$ are identical:

(86a)    $\log \Gamma(x) = -Cx - \log x - \sum_{j=1}^\infty \left[ \log \frac{x + j}{j} - \frac{x}{j} \right].$

Our derivation also shows that

(86b)    $\frac{\Gamma'(x)}{\Gamma(x)} = -C + v(x) = -C - \frac{1}{x} - \sum_{j=1}^\infty \left( \frac{1}{x + j} - \frac{1}{j} \right),$

(86c)    $\frac{d^2 \log \Gamma(x)}{dx^2} = w(x) = \frac{1}{x^2} + \sum_{j=1}^\infty \frac{1}{(x + j)^2}.$

Forming the exponential function of both sides of equation (86a),
we arrive at the Weierstrass infinite product for $1/\Gamma(x)$:

(86d)    $\frac{1}{\Gamma(x)} = x e^{Cx} \prod_{j=1}^\infty \left( 1 + \frac{x}{j} \right) e^{-x/j}$    $(x > 0).$
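We can write (86d) in a slightly different form not involving the
Euler constant $C$. From (86a), (85h),

    $\log \Gamma(x) = -\log x + \lim_{n \to \infty} \sum_{j=1}^{n-1} \left( \frac{x}{j} - \log \frac{x + j}{j} \right) - Cx$
    $= -\log x + \lim_{n \to \infty} \left[ x \left( \sum_{j=1}^{n-1} \frac{1}{j} - C - \log n \right) + x \log n - \sum_{j=1}^{n-1} \log \frac{x + j}{j} \right]$
    $= -\log x + \lim_{n \to \infty} \left[ x \log n + \sum_{j=1}^{n-1} \log j - \sum_{j=1}^{n-1} \log (x + j) \right].$

Consequently, we obtain the formula

(86e)    $\Gamma(x) = \lim_{n \to \infty} \frac{1 \cdot 2 \cdot 3 \cdots (n - 1)}{x (x + 1)(x + 2) \cdots (x + n - 1)}\, n^x$    $(x > 0),$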


which is Gauss’s infinite product for the gamma function. 
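Finite stages of the Weierstrass product (86d) and of Gauss's product (86e) can be compared with a library value of the gamma function. The sketch below works with logarithms to avoid overflow and is restricted to $x > 0$; the numerical value of Euler's constant is quoted, and the function names are illustrative.

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant, quoted to double precision

def gauss_product(x, n):
    """Finite stage of (86e): (n-1)! n^x / (x (x+1) ... (x+n-1)), for x > 0."""
    log_value = x * math.log(n) - math.log(x)
    for j in range(1, n):
        log_value += math.log(j) - math.log(x + j)
    return math.exp(log_value)

def weierstrass_reciprocal(x, n):
    """Finite stage of (86d): x e^{Cx} prod_{j=1}^{n} (1 + x/j) e^{-x/j}, for x > 0."""
    log_value = math.log(x) + EULER_C * x
    for j in range(1, n + 1):
        log_value += math.log1p(x / j) - x / j
    return math.exp(log_value)
```

Both stages converge only like $1/n$: with `n = 20000` the values `gauss_product(2.5, n)` and `1 / weierstrass_reciprocal(2.5, n)` match `math.gamma(2.5)` to roughly four significant digits.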

The limit on the right-hand side of (86e) exists not only for positive
values of $x$ but for all $x \ne 0, -1, -2, \ldots$: for a given $x$ let the
positive integer $m$ be chosen so large that $x + m > 0$. Then, replacing
$n$ by $n + m$ under the limit sign, we obtain

    $\lim_{n \to \infty} \frac{1 \cdot 2 \cdots (n - 1)}{x (x + 1)(x + 2) \cdots (x + n - 1)}\, n^x$
    $= \lim_{n \to \infty} \frac{1 \cdot 2 \cdots (n + m - 1)}{x (x + 1) \cdots (x + n + m - 1)}\, (n + m)^x$
    $= \frac{1}{x (x + 1) \cdots (x + m - 1)} \lim_{n \to \infty} \frac{1 \cdot 2 \cdots (n - 1)\, n^{x+m}}{(x + m)(x + m + 1) \cdots (x + m + n - 1)} \cdot \frac{n (n + 1) \cdots (n + m - 1)\, (n + m)^x}{n^{x+m}}$
    $= \frac{\Gamma(x + m)}{x (x + 1) \cdots (x + m - 1)}.$

Thus, we can use Gauss's formula (86e) to define $\Gamma(x)$ for all values of
$x$ other than zero or negative integers. When $x$ approaches one of
these exceptional values, $\Gamma(x)$ becomes infinite. The extended
function $\Gamma(x)$ obviously still satisfies the functional equation

(86f)    $\Gamma(x + 1) = x\, \Gamma(x).$


d. The Extension Theorem 


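The values of the gamma function for negative values of $x$ can also
easily be obtained from the values for positive values of $x$ by means of
the so-called extension theorem. We form the product $\Gamma(x)\Gamma(-x)$,
which is

    $\lim_{n \to \infty} \frac{1 \cdot 2 \cdots (n - 1)}{x (x + 1) \cdots (x + n - 1)}\, n^x \cdot \lim_{n \to \infty} \frac{1 \cdot 2 \cdots (n - 1)}{-x (1 - x) \cdots (n - 1 - x)}\, n^{-x},$

and combine the two limiting processes into one, to obtain

    $\Gamma(x)\Gamma(-x) = -\frac{1}{x^2} \lim_{n \to \infty} \frac{1}{\{1 - (x/1)^2\} \{1 - (x/2)^2\} \cdots \{1 - [x/(n - 1)]^2\}},$

provided $x$ is not an integer. But, by employing the infinite product
for the sine,

    $\frac{\sin \pi x}{\pi x} = \prod_{j=1}^\infty \left( 1 - \frac{x^2}{j^2} \right),$

from Volume I (p. 603), we obtain

    $\Gamma(x)\Gamma(-x) = -\frac{\pi}{x \sin \pi x}.$

Hence,

    $\Gamma(-x) = -\frac{\pi}{x \sin \pi x} \cdot \frac{1}{\Gamma(x)}.$

We can put this relation in a somewhat different form by calculating
the product $\Gamma(x)\Gamma(1 - x)$. Since by (86f)

    $\Gamma(1 - x) = -x\, \Gamma(-x),$

we obtain the extension theorem

(97a)    $\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin \pi x}.$

Thus, if we put $x = \frac{1}{2}$, we have $\Gamma(\frac{1}{2}) = \sqrt{\pi}$. Since

    $\Gamma\!\left( \frac{1}{2} \right) = 2 \int_0^\infty e^{-u^2}\, du,$

we have here a new proof for the fact that the integral

    $\int_0^\infty e^{-u^2}\, du$

has the value $\frac{1}{2}\sqrt{\pi}$ (see p. 415). In addition, we can calculate
the gamma function for the arguments $x = n + \frac{1}{2}$, where $n$ is any
positive integer:

(97b)    $\Gamma\!\left( n + \frac{1}{2} \right) = \left( n - \frac{1}{2} \right) \left( n - \frac{3}{2} \right) \cdots \frac{3}{2} \cdot \frac{1}{2}\, \Gamma\!\left( \frac{1}{2} \right) = \frac{(2n - 1)(2n - 3) \cdots 3 \cdot 1}{2^n}\, \sqrt{\pi}.$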
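The extension theorem (97a) and the half-integer formula (97b) can both be checked against the standard-library gamma function, which also accepts negative non-integer arguments. The helper names below are illustrative.

```python
import math

def reflection_gap(x):
    """Gamma(x) Gamma(1 - x) - pi / sin(pi x), for non-integer x."""
    return math.gamma(x) * math.gamma(1.0 - x) - math.pi / math.sin(math.pi * x)

def gamma_half_integer(n):
    """(97b): Gamma(n + 1/2) = (2n-1)(2n-3)...3*1 / 2^n * sqrt(pi)."""
    odd_product = 1
    for k in range(1, 2 * n, 2):
        odd_product *= k
    return odd_product / 2.0**n * math.sqrt(math.pi)
```

For example, `reflection_gap(0.3)` and `reflection_gap(-1.7)` vanish up to rounding, and `gamma_half_integer(3)` equals $\frac{15}{8}\sqrt{\pi}$, the value of $\Gamma(3.5)$.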


e. The Beta Function 


Another important function defined by an improper integral involving
parameters is Euler's beta function. The beta function is defined by

(98a)    $B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\, dt.$

If either $x$ or $y$ is less than unity, the integral is improper. By the
criterion of p. 465, however, it converges uniformly in $x$ and $y$,
provided we restrict ourselves to intervals $x \ge \varepsilon$, $y \ge \eta$,
where $\varepsilon$ and $\eta$ are arbitrary positive numbers. It therefore
represents a continuous function for all positive values of $x$ and $y$.

We obtain a somewhat different expression for $B(x, y)$ by using the
substitution $t = \tau + \frac{1}{2}$:

(98b)    $B(x, y) = \int_{-1/2}^{1/2} \left( \frac{1}{2} + \tau \right)^{x-1} \left( \frac{1}{2} - \tau \right)^{y-1} d\tau.$

If we now put $\tau = t/2s$, where $s > 0$, we obtain

(98c)    $(2s)^{x+y-1} B(x, y) = \int_{-s}^{s} (s + t)^{x-1} (s - t)^{y-1}\, dt.$

If, finally, we put $t = \sin^2 \phi$ in formula (98a), we obtain

(98d)    $B(x, y) = 2 \int_0^{\pi/2} \sin^{2x-1} \phi\, \cos^{2y-1} \phi\, d\phi.$


We shall now show how the beta function can be expressed in
terms of the gamma function, by using a few transformations which,
at first sight, may seem strange.

If we multiply both sides of the equation (98c) by $e^{-2s}$ and integrate
with respect to $s$ from 0 to $A$, we have

    $B(x, y) \int_0^A e^{-2s} (2s)^{x+y-1}\, ds = \int_0^A e^{-2s}\, ds \int_{-s}^{s} (s + t)^{x-1} (s - t)^{y-1}\, dt.$

The double integral on the right may be regarded as an integral
of the function

    $e^{-2s} (s + t)^{x-1} (s - t)^{y-1}$

over the isosceles triangle in the $s, t$-plane bounded by the lines
$s \pm t = 0$ and $s = A$. If we apply the transformation

    $\sigma = s + t,$
    $\tau = s - t,$

this integral becomes

    $\frac{1}{2} \iint_R e^{-\sigma - \tau} \sigma^{x-1} \tau^{y-1}\, d\sigma\, d\tau.$

The region of integration $R$ is now the triangle in the $\sigma, \tau$-plane
bounded by the lines $\sigma = 0$, $\tau = 0$, and $\sigma + \tau = 2A$.
If we let $A$ increase beyond all bounds, the left-hand side, by (82a),
tends to the function

    $\frac{1}{2} B(x, y)\, \Gamma(x + y).$

Therefore, the right side must also converge and its limit is the double
integral over the whole first quadrant of the $\sigma, \tau$-plane, the
quadrant being approximated by means of isosceles triangles. Since the
integrand is positive in this region and the integral converges for a
monotonic sequence of regions (by Chapter 4, p. 414) this limit is
independent of the mode of approximation to the quadrant. In particular,
we can use squares of side $A$ and accordingly write

    $B(x, y)\, \Gamma(x + y) = \lim_{A \to \infty} \int_0^A \int_0^A e^{-\sigma - \tau} \sigma^{x-1} \tau^{y-1}\, d\sigma\, d\tau$
    $= \int_0^\infty e^{-\sigma} \sigma^{x-1}\, d\sigma \int_0^\infty e^{-\tau} \tau^{y-1}\, d\tau.$

We therefore obtain the important relation¹

(99a)    $B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)}.$

From this relation we see that the beta function is related to the
binomial coefficients

    $\binom{n + m}{n} = \frac{(n + m)!}{n!\, m!}$

in roughly the same way as the gamma function is related to the
numbers $n!$. For integers $n, m$ in fact,

(99b)    $\binom{n + m}{n} = \frac{1}{(n + m + 1)\, B(n + 1, m + 1)}.$

¹This equation can also be obtained from Bohr's theorem. We first show
that $B(x, y)$ satisfies the functional equation

    $B(x + 1, y) = \frac{x}{x + y}\, B(x, y),$

so that the function

    $u(x, y) = \Gamma(x + y)\, B(x, y),$

considered as a function of $x$, satisfies the functional equation of the
gamma function,

    $u(x + 1) = x\, u(x).$

The convexity of $\log B(x, y)$ and, hence, that of $\log u(x)$ follows from
the Cauchy-Schwarz inequality in the same way as that of $\log \Gamma(x)$
on p. 501. Thus, we have

    $\Gamma(x + y)\, B(x, y) = \Gamma(x) \cdot \alpha(y),$

and finally, if we put $x = 1$, $\alpha(y) = \Gamma(1 + y)\, B(1, y) = \Gamma(y)$.
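Relations (99a) and (99b) can be verified numerically. The sketch below evaluates the trigonometric form (98d) by a midpoint rule (adequate when $x, y \ge 1$, so that the integrand is bounded) and compares it with the gamma-function expression; the function names and the number of nodes are illustrative choices.

```python
import math

def beta_integral(x, y, n=2000):
    """Midpoint rule for (98d): B(x, y) = 2 * int_0^{pi/2} sin^(2x-1) cos^(2y-1)."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * h
        total += math.sin(phi) ** (2.0 * x - 1.0) * math.cos(phi) ** (2.0 * y - 1.0)
    return 2.0 * total * h

def beta_gamma(x, y):
    """Right-hand side of (99a)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def binomial_from_beta(n, m):
    """Right-hand side of (99b)."""
    return 1.0 / ((n + m + 1) * beta_gamma(n + 1, m + 1))
```

For instance, `beta_integral(2.0, 3.0)` and `beta_gamma(2.0, 3.0)` both give $1/12$, and `binomial_from_beta(5, 3)` reproduces the binomial coefficient 56.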
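Finally, we mention that the definite integrals

    $\int_0^{\pi/2} \sin^\alpha t\, dt$ and $\int_0^{\pi/2} \cos^\alpha t\, dt,$

which by (98d) are identical with the functions

    $\frac{1}{2} B\!\left( \frac{\alpha + 1}{2}, \frac{1}{2} \right)$ and $\frac{1}{2} B\!\left( \frac{1}{2}, \frac{\alpha + 1}{2} \right),$

can be simply expressed in terms of the gamma function:

(99c)    $\int_0^{\pi/2} \sin^\alpha t\, dt = \int_0^{\pi/2} \cos^\alpha t\, dt = \frac{\sqrt{\pi}}{2}\, \frac{\Gamma((\alpha + 1)/2)}{\Gamma(1 + \alpha/2)}.$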


f. Differentiation and Integration to Fractional Order. 
Abel’s Integral Equation 


Using our knowledge of the gamma function, we now carry out a
simple process of generalization of the concepts of differentiation and
integration. We have already seen (p. 78) that the formula

(100a)    $F(x) = \int_0^x \frac{(x - t)^{n-1}}{(n - 1)!} f(t)\, dt = \frac{1}{\Gamma(n)} \int_0^x (x - t)^{n-1} f(t)\, dt$

gives the $n$-times-repeated integral of the function $f(x)$ between the
limits 0 and $x$. If $D$ symbolically denotes the operator of
differentiation and if $D^{-1}$ denotes the operator

    $\int_0^x \cdots\, dx,$

which is an inverse of differentiation, we may write

(100b)    $F(x) = D^{-n} f(x).$

The mathematical statement conveyed by this formula is that the
function $F(x)$ and its first $(n - 1)$ derivatives vanish at $x = 0$ and that
the $n$th derivative of $F(x)$ is $f(x)$. But it is now very natural to
construct a definition for the operator $D^{-\lambda}$ even when the positive
number $\lambda$ is not necessarily an integer. The integral of order
$\lambda$ of the function $f(x)$ between the limits 0 and $x$ is defined by
the expression

(100c)    $D^{-\lambda} f(x) = \frac{1}{\Gamma(\lambda)} \int_0^x (x - t)^{\lambda - 1} f(t)\, dt.$
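Definition (100c) can be evaluated numerically once the endpoint singularity of $(x - t)^{\lambda - 1}$ is removed. The sketch below uses the substitution $w = (x - t)^\lambda$, which turns (100c) into $\frac{1}{\Gamma(\lambda + 1)} \int_0^{x^\lambda} f(x - w^{1/\lambda})\, dw$; this substitution and the midpoint rule are choices made for the example, not part of the text.

```python
import math

def fractional_integral(f, lam, x, n=4000):
    """Approximate D^{-lam} f(x) from (100c) for lam > 0 and x >= 0.

    The substitution w = (x - t)^lam removes the factor (x - t)^(lam - 1),
    leaving a midpoint rule over 0 <= w <= x^lam."""
    if x == 0.0:
        return 0.0
    upper = x ** lam
    h = upper / n
    total = 0.0
    for k in range(n):
        w = (k + 0.5) * h
        total += f(x - w ** (1.0 / lam))
    return total * h / math.gamma(lam + 1.0)
```

For $f(t) = t$ and $\lambda = \frac{1}{2}$ the exact value is $x^{3/2}/\Gamma(5/2)$, and for $\lambda = 1$ the function reduces to the ordinary integral from 0 to $x$.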


This definition may now be used to generalize $n$th-order
differentiation, symbolized by the operator $D^n$ or $d^n/dx^n$, to
$\mu$th-order differentiation, where $\mu$ is an arbitrary nonnegative
number. Let $m$ be the least integer greater than $\mu$, so that
$\mu = m - \rho$, where $0 < \rho \le 1$. Then our definition is

(101a)    $D^\mu f(x) = D^m D^{-\rho} f(x) = \frac{d^m}{dx^m} \left[ \frac{1}{\Gamma(\rho)} \int_0^x (x - t)^{\rho - 1} f(t)\, dt \right].$

A reversal of the order of the two processes would give the definition

    $D^\mu f(x) = D^{-\rho} D^m f(x) = \frac{1}{\Gamma(\rho)} \int_0^x (x - t)^{\rho - 1} f^{(m)}(t)\, dt.$

It is left to the reader (see Exercise 12) to employ the formulas for
the gamma function to prove that

(101b)    $D^\alpha D^\beta f(x) = D^\beta D^\alpha f(x),$

where $\alpha$ and $\beta$ are arbitrary real numbers. He should show that
these relations and the generalized process of differentiation have a
meaning whenever the function $f(x)$ is differentiable in the ordinary
way to a sufficiently high order for all $x$ and vanishes for $x < 0$. In
general $D^\mu f(x)$ exists if $f(x)$ has continuous derivatives up to, and
including, the $m$th order.

In connection with these ideas, we mention Abel's integral equation,
which has important applications. Since

    $\Gamma\!\left( \frac{1}{2} \right) = \sqrt{\pi},$

the integral of a function $f(x)$ to the order $\frac{1}{2}$ is given by the
formula

(102)    $D^{-1/2} f(x) = \frac{1}{\sqrt{\pi}} \int_0^x \frac{f(t)}{\sqrt{x - t}}\, dt = \psi(x).$

Formula (102) is called Abel's integral equation when it is to be
solved for an unknown function $f(x)$, the function $\psi(x)$ on the right
side being given. If the function $\psi(x)$ is continuously differentiable
and vanishes at $x = 0$, the solution of the equation is given by the
formula

(103a)    $f(x) = D^{1/2} \psi(x),$

or

(104)    $f(x) = \frac{1}{\sqrt{\pi}}\, \frac{d}{dx} \int_0^x \frac{\psi(t)}{\sqrt{x - t}}\, dt.$
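As a worked instance of Abel's equation, take $\psi(x) = x$. Then $D^{-1/2} \psi(x) = x^{3/2}/\Gamma(5/2)$, and differentiating gives $f(x) = D^{1/2} \psi(x) = 2\sqrt{x/\pi}$. The sketch below checks numerically that this $f$ satisfies (102); the substitution $t = x - w^2$ is a choice made here to remove the singularity of the kernel.

```python
import math

def half_integral(f, x, n=20000):
    """D^{-1/2} f(x) = (1/sqrt(pi)) * int_0^x f(t) / sqrt(x - t) dt,
    evaluated by the midpoint rule after substituting t = x - w^2."""
    upper = math.sqrt(x)
    h = upper / n
    total = sum(f(x - ((k + 0.5) * h) ** 2) for k in range(n))
    return 2.0 * total * h / math.sqrt(math.pi)

def candidate_solution(t):
    """f = D^{1/2} psi for psi(x) = x, that is, f(t) = 2 sqrt(t / pi)."""
    return 2.0 * math.sqrt(t / math.pi)
```

Evaluating `half_integral(candidate_solution, x)` returns `x` up to the quadrature error, confirming the solution formula in this special case.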


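Exercises 4.14


1. Verify that for nonnegative integral $n$,

    $\Gamma\!\left( n + \frac{1}{2} \right) = \frac{(2n)!\, \sqrt{\pi}}{n!\, 4^n}.$

2. Find $\Gamma(\frac{1}{2} - n)$ where $n$ is a positive integer.

3. Show that

    $B(x, x) = 2^{1-2x} B\!\left( x, \frac{1}{2} \right).$

4. Prove

    $\int_0^1 \frac{dt}{\sqrt{1 - t^x}} = \frac{\sqrt{\pi}}{x}\, \frac{\Gamma(1/x)}{\Gamma\!\left( \frac{1}{x} + \frac{1}{2} \right)}.$

5. Establish the following relations:

(a) $\int_0^1 \frac{x^{2n+1}}{\sqrt{1 - x^2}}\, dx = \frac{(n!)^2\, 2^{2n}}{(2n + 1)!},$

(b) $\int_0^1 \frac{x^{2n}}{\sqrt{1 - x^2}}\, dx = \frac{(2n)!}{2^{2n+1} (n!)^2}\, \pi.$

6. Prove that the volume of the positive octant bounded by the planes
$x = 0$, $y = 0$, $z = h$ and the surface $x^m/a^m + y^m/b^m = z/c$, where
$m > 0$, is

    $abh \left( \frac{h}{c} \right)^{2/m} \frac{\Gamma(1 + 1/m)^2}{\Gamma(2 + 2/m)}.$

7. Prove that

    $\iiint f\!\left( \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \right) x^{p-1} y^{q-1} z^{r-1}\, dx\, dy\, dz$

taken throughout the positive octant of the ellipsoid
$x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$ is equal to

    $\frac{a^p b^q c^r}{8}\, \frac{\Gamma(p/2)\Gamma(q/2)\Gamma(r/2)}{\Gamma((p + q + r)/2)} \int_0^1 f(\xi)\, \xi^{(p+q+r)/2 - 1}\, d\xi.$

(Hint: Introduce new variables $\xi, \eta, \zeta$ by writing

    $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = \xi$ or $x = a \sqrt{\xi (1 - \eta)},$
    $\frac{y^2}{b^2} + \frac{z^2}{c^2} = \xi\eta$ or $y = b \sqrt{\xi\eta (1 - \zeta)},$
    $\frac{z^2}{c^2} = \xi\eta\zeta$ or $z = c \sqrt{\xi\eta\zeta},$

and perform the integrations with respect to $\eta$ and $\zeta$.)

8. Find the $x$-coordinate of the center of mass of the solid

    $\left( \frac{x}{a} \right)^{1/n} + \left( \frac{y}{b} \right)^{1/n} + \left( \frac{z}{c} \right)^{1/n} \le 1, \quad x \ge 0,\ y \ge 0,\ z \ge 0.$

9. Find the moment of inertia of the area enclosed by the astroid
$x^{2/3} + y^{2/3} = R^{2/3}$ with respect to the $x$-axis.

10. Prove that the $(n + 1)$-fold integral

    $\int \cdots \int f(x_0 + \cdots + x_n)\, x_0^{a_0 - 1} \cdots x_n^{a_n - 1}\, dx_0 \cdots dx_n$

taken over the positive orthant $x_k \ge 0$ for $k = 0, \ldots, n$ bounded by
the hyperplane $x_0 + \cdots + x_n = 1$ is equal to

    $\frac{\Gamma(a_0) \cdots \Gamma(a_n)}{\Gamma(a_0 + \cdots + a_n)} \int_0^1 f(t)\, t^{a_0 + a_1 + \cdots + a_n - 1}\, dt.$

11. Prove that

    $\frac{\Gamma(x)\, \Gamma\!\left( x + \frac{1}{2} \right)}{\Gamma(2x)} = 2^{1-2x} \sqrt{\pi}.$

12. (a) Show that for any positive real numbers $\alpha$ and $\beta$

    $D^\alpha D^\beta f(x) = D^\beta D^\alpha f(x),$

where the derivatives are defined by (101a) and $f$ has ordinary
derivatives up to $(p + q)$-th order that vanish at $x = 0$, $p$ and $q$ being
the least integers greater than $\alpha$ and $\beta$, respectively.

(b) Under the foregoing conditions, is it always true that
$D^\alpha D^\beta f(x) = D^{\alpha + \beta} f(x)$?

(c) Extend the foregoing result to the case in which $\alpha$ or $\beta$ may be
negative.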
Appendix: Detailed Analysis of the Process of Integration¹


A.1 Areas 


The area of a set S can be defined rigorously along the lines sug- 
gested by intuition, as explained on pp. 368. Essentially one uses a 
subdivision of the plane into squares by lines parallel to the coordi- 
nate axes. One adds up the areas of the squares completely contained 
in S. This yields a lower bound for the area of S. Adding up the areas 
of all squares having points in common with S, we obtain an upper 
bound for the area of S. If these lower and upper bounds converge 
toward one and the same value as the subdivision of the plane is re- 
fined indefinitely, we identify this common value with the area of S. 
This construction for the area of a region incorporates the same ideas 
of approximating the region from inside and outside by regions com- 
posed of rectangles that led us to the notion of the Riemann integral 
of a function f(x). 

The concept of area, as defined here, is named the Jordan measure 
(after one of the initiators of modern precise analysis) or content of 
S. This is not the only way to introduce areas. (An extremely important 
definition that applies to more general sets yields the so-called 
Lebesgue measure of S.) The Jordan measure, which will occupy us 
here exclusively, has the advantage of greater intuitive immediacy 
and is quite adequate for those portions of analysis that lie within the 
scope of this book. 

For simplicity, we shall work mainly in the plane. However, our 
treatment will apply to higher dimensions with only such changes of 
terminology as the replacement of the term area by volume, square 
by cube and so on. 


a. Subdivisions of the Plane and the Corresponding Inner and 
Outer Areas 


To define the area of a set S in the x, y-plane, we use successive
subdivisions of the plane into squares of side 1, 1/2, 1/4, 1/8, . . . by
equidistant parallels to the coordinate axes.² The nth subdivision
(where n is a positive integer) is produced by the lines


¹Before reading this Appendix the reader would do well to review the arguments
leading to the Riemann integral in Volume I (pp. 192-195).

²It is helpful at this stage to introduce area through a quite specific set of
subdivisions of the plane into squares. Later, it will turn out that much more
general subdivisions lead to the same area.
(1)    $x = \dfrac{i}{2^n}, \qquad y = \dfrac{k}{2^n},$


where i and k range over all integers. The plane is then divided into
the closed squares $R_{ik}^n$ given by

(2)    $R_{ik}^n:\quad \dfrac{i}{2^n} \le x \le \dfrac{i+1}{2^n}, \qquad \dfrac{k}{2^n} \le y \le \dfrac{k+1}{2^n}.$


Let now S be any bounded set of points in the plane.¹ We form
approximations from below and from above to the prospective area A
of S by forming the sum $A_n^-$ of the areas of all squares $R_{ik}^n$ that are
completely contained in S, and the sum $A_n^+$ of the areas of all squares
$R_{ik}^n$ that have points in common with S. Here the area of a square $R_{ik}^n$
that has side $2^{-n}$ is defined to be $2^{-2n}$. Using the symbolic notation for
relations between sets explained on p. 114, we have, accordingly,²

(3)    $A_n^- = \sum_{R_{ik}^n \subset S} 2^{-2n}, \qquad A_n^+ = \sum_{R_{ik}^n \cap S \ne \emptyset} 2^{-2n}$


(see Fig. 4-1). 
It is clear from the definition that 


(4)    $0 \le A_n^- \le A_n^+.$
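
As a numerical illustration of the inner and outer sums, the following minimal sketch evaluates both for one concrete set. The choice of S as the closed unit disk, the function name `inner_outer_sums_disk`, and the corner tests are our assumptions for illustration and are not part of the text. A closed square lies in the disk exactly when its corner farthest from the origin does, and it meets the disk exactly when its point nearest to the origin does.

```python
import math

def inner_outer_sums_disk(n, radius=1.0):
    """Return (inner sum, outer sum) of the nth dyadic subdivision for the
    closed disk x^2 + y^2 <= radius^2 (an illustrative choice of S)."""
    h = 2.0 ** (-n)                    # side of each square
    m = math.ceil(radius / h) + 1      # index range that surely covers S
    r2 = radius * radius
    inner = outer = 0
    for i in range(-m, m):
        for k in range(-m, m):
            x0, x1 = i * h, (i + 1) * h
            y0, y1 = k * h, (k + 1) * h
            # Corner farthest from the origin: decides "contained in S".
            fx, fy = max(abs(x0), abs(x1)), max(abs(y0), abs(y1))
            # Point of the closed square nearest to the origin: decides
            # "has points in common with S".
            nx = 0.0 if x0 <= 0.0 <= x1 else min(abs(x0), abs(x1))
            ny = 0.0 if y0 <= 0.0 <= y1 else min(abs(y0), abs(y1))
            if fx * fx + fy * fy <= r2:
                inner += 1
            if nx * nx + ny * ny <= r2:
                outer += 1
    return inner * h * h, outer * h * h

for n in range(1, 7):
    lower, upper = inner_outer_sums_disk(n)
    print(n, lower, upper)   # the two columns bracket pi and move toward it
```

The inner column can only grow and the outer column can only shrink as n increases, which is the behavior one expects from nested dyadic subdivisions.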


As we pass from the nth to the (n + 1)-st subdivision, each square
$R_{ik}^n$ is broken up into four squares $R_{js}^{n+1}$. If $R_{ik}^n$ is contained in S, so must
be its parts $R_{js}^{n+1}$. If, on the other hand, a part $R_{js}^{n+1}$ contains a point
of S, then the same holds for the whole square $R_{ik}^n$.

It follows³ that successive sums satisfy the inequalities

(5)    $A_n^- \le A_{n+1}^- \le A_{n+1}^+ \le A_n^+.$

We see from (5) that the sums $A_n^-$ form a nondecreasing sequence
with the upper bound $A_1^+$; hence, they converge to a limit,

       $A^- = \lim_{n \to \infty} A_n^-.$


¹Areas, properly speaking, will only be defined for bounded sets, although an
“improper” area is defined for some unbounded sets as limit of “proper” areas.
²If no square $R_{ik}^n$ is contained completely in S, we put $A_n^- = 0$.
³We have used here that the sum of the areas of the four squares $R_{js}^{n+1}$ making up
$R_{ik}^n$ equals the area of $R_{ik}^n$, which, in this context, follows from the arithmetical
identity

       $4 \cdot 2^{-2(n+1)} = 2^{-2n}.$
Similarly, the sums $A_n^+$ form a nonincreasing sequence with lower
bound $A_1^-$ and converge:

       $A^+ = \lim_{n \to \infty} A_n^+.$

By (5), we have for all n

(6)    $0 \le A_n^- \le A^- \le A^+ \le A_n^+.$

We call $A^-$ the inner area and $A^+$ the outer area¹ of S. Every bounded
set S has an inner and an outer area, which we denote by $A^-(S)$ and
$A^+(S)$.

The inner area $A^-(S)$ has the value 0 if and only if S has no interior
points, for a set with no interior points contains no square $R_{ik}^n$, so that
$A_n^- = 0$ for all n, and thus, $A^- = 0$. A set with interior points contains
some square $R_{ik}^n$ for sufficiently large n, so that $A_n^- > 0$ for large n,
and hence, $A^- > 0$.


b. Jordan-Measurable Sets and Their Areas 


We call a bounded set S Jordan-measurable if the inner area $A^-$
and the outer area $A^+$ of S coincide.² We denote the common value by
A and call it the area or the Jordan measure of S:

       $A^-(S) = A^+(S) = A(S).$

Note that for the squares $R_{ik}^n$ used in our definitions, the original
notion of “area” and the new one, the Jordan measure, coincide. Each
square $R_{ik}^n$ has the Jordan measure $2^{-2n}$ in the sense of the general
definition, since for S = $R_{ik}^n$ and m > n

       $A_m^-(S) = (2^{m-n})^2\, 2^{-2m} = 2^{-2n},$
       $A_m^+(S) = [(2^{m-n})^2 + 4(2^{m-n}) + 4]\, 2^{-2m} = 2^{-2n} + 2^{2-m-n} + 2^{2-2m}.$


More generally, any rectangle S with sides parallel to the coordinate
axes:

       $a \le x \le b, \qquad c \le y \le d$


¹The terms interior Jordan measure or interior content, or, respectively, exterior
Jordan measure or exterior content, are also commonly used.

²Instead of using the phrase “the set S is Jordan-measurable,” we shall simply say,
“S has an area.” The term measure has the advantage of being independent of
dimension and can be used equally well for length in one dimension, as for area in
two dimensions, and for volume in higher dimensions.
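has the area (b − a)(d − c), as expected from elementary geometry;
for, given a positive integer n, we can find integers α, β, γ, δ such that

       $\alpha 2^{-n} < a \le (\alpha + 1) 2^{-n}, \qquad \beta 2^{-n} \le b < (\beta + 1) 2^{-n},$
       $\gamma 2^{-n} < c \le (\gamma + 1) 2^{-n}, \qquad \delta 2^{-n} \le d < (\delta + 1) 2^{-n}.$

Then, for n sufficiently large,

       $A_n^-(S) = (\beta - \alpha - 1)(\delta - \gamma - 1)\, 2^{-2n} \ge (b - a - 2^{1-n})(d - c - 2^{1-n}),$
       $A_n^+(S) = (\beta - \alpha + 1)(\delta - \gamma + 1)\, 2^{-2n} \le (b - a + 2^{1-n})(d - c + 2^{1-n}),$

so that for n → ∞,

       $\lim_{n \to \infty} A_n^-(S) = \lim_{n \to \infty} A_n^+(S) = (b - a)(d - c).$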
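
The counting argument for a rectangle can be checked exactly with rational arithmetic. The sketch below is only an illustration under our own assumptions: the function name `rectangle_sums` and the sample rectangle 1/3 ≤ x ≤ 2, 1/5 ≤ y ≤ 1 are hypothetical choices. It counts the columns and rows of closed dyadic squares that lie inside the rectangle and those that merely touch it.

```python
from fractions import Fraction
from math import floor, ceil

def rectangle_sums(a, b, c, d, n):
    """Inner and outer sums of the nth dyadic subdivision for the closed
    rectangle a <= x <= b, c <= y <= d, computed exactly."""
    s = 2 ** n
    # Columns [i/s, (i+1)/s] inside [a, b]: ceil(a*s) <= i <= floor(b*s) - 1.
    cols_in = max(0, floor(b * s) - ceil(a * s))
    rows_in = max(0, floor(d * s) - ceil(c * s))
    # Columns meeting [a, b]: ceil(a*s) - 1 <= i <= floor(b*s).
    cols_out = floor(b * s) - ceil(a * s) + 2
    rows_out = floor(d * s) - ceil(c * s) + 2
    return (Fraction(cols_in * rows_in, s * s),
            Fraction(cols_out * rows_out, s * s))

a, b, c, d = Fraction(1, 3), Fraction(2), Fraction(1, 5), Fraction(1)
area = (b - a) * (d - c)               # elementary area, here 4/3
for n in (1, 4, 10):
    inner, outer = rectangle_sums(a, b, c, d, n)
    print(n, float(inner), float(area), float(outer))
```

Because every quantity is a fraction, the inequalities between the inner sum, the elementary area, and the outer sum can be compared without rounding error.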


Our next task is to find criteria for measurability of a set S. We
shall prove quite generally that necessary and sufficient for a bounded
set S to have an area is that its boundary ∂S have area zero.

In proof, consider a subdivision of the plane into squares $R_{ik}^n$ and
form the corresponding sums $A_n^-(S)$ and $A_n^+(S)$ as in (3). Obviously,
$A_n^+ - A_n^-$ represents the sum of the areas of the squares $R_{ik}^n$ that
contain points in S as well as points not in S. Let $\sigma_n$ be the set of those
squares. Each square of $\sigma_n$ contains a boundary point of S, for on the
line segment joining a point P of $R_{ik}^n$ in S to a point Q not in S but in
the same square $R_{ik}^n$, there certainly lies a boundary point of S. Hence,
each square of $\sigma_n$ has points in common with ∂S, and consequently,

       $A_n^+(S) - A_n^-(S) \le A_n^+(\partial S).$

If ∂S has area 0 (or, what is the same, outer area 0) the right-hand side
tends to 0 for n → ∞, and we find that $A^+(S) - A^-(S) = 0$, or that S
has an area.

Conversely, let S have an area, so that

(7)    $\lim_{n \to \infty} [A_n^+(S) - A_n^-(S)] = 0.$

A point P in the plane that for a fixed n belongs only to squares $R_{ik}^n$
contained in S must be an interior point of S.¹ Similarly, a point
belonging only to squares free of points of S must be an exterior point of
S. Let P be a boundary point of S. If P did not lie in any square of
$\sigma_n$, it would have to belong to a square contained in S as well as to a
square free of points of S. But this is impossible since two such squares
cannot have a common point. Hence, every P in ∂S is contained in a
square $R_{ik}^n$ of the set $\sigma_n$. The total area of those squares is
$A_n^+(S) - A_n^-(S)$. Any square $R_{ik}^n$ having a point in common with ∂S either
is then a square in $\sigma_n$ or one of the eight neighbors of such a square,
having a point in common with it. Hence, the total area of the squares
$R_{ik}^n$ having points in common with ∂S cannot exceed nine times the total
area of the squares in $\sigma_n$:

       $A_n^+(\partial S) \le 9[A_n^+(S) - A_n^-(S)].$

Hence, (7) implies that $A^+(\partial S) = 0$ and, thus, that ∂S has area 0.

¹Remember that our squares $R_{ik}^n$ are closed. Hence, P could belong to as many as
four squares.

An example of a set that does not have an area A in our sense is
furnished by the set of rational points in the unit square, that is, the
set S consisting of the points (x, y), where x and y are rational numbers
between 0 and 1. Here the boundary ∂S is the set of all (x, y) with
0 ≤ x ≤ 1, 0 ≤ y ≤ 1 and, hence, has area 1. It follows from our
theorem that S is not Jordan-measurable.


c. Basic Properties of Area 


Let S and T be two bounded sets with S contained in T. A square
$R_{ik}^n$ that contains a point of S necessarily contains a point of T, so that

       $A_n^+(S) \le A_n^+(T).$

For n → ∞ we find that generally

(8)    $A^+(S) \le A^+(T)$   for S ⊂ T.

In the particular case that $A^+(T) = 0$, we conclude that also
$A^+(S) = 0$. Hence:


Any subset of a set of area 0 has area 0.

For any two bounded sets S, T the totality of squares $R_{ik}^n$ covering
S and T also covers their union S ∪ T. Hence

       $A_n^+(S \cup T) \le A_n^+(S) + A_n^+(T).$

For n → ∞ we find that

(9)    $A^+(S \cup T) \le A^+(S) + A^+(T).$
More generally, for any finite number of sets $S_1, S_2, \ldots, S_N$ we have
the finite subadditivity of outer areas expressed by the formula

(10)    $A^+\left(\bigcup_{i=1}^{N} S_i\right) \le \sum_{i=1}^{N} A^+(S_i).$


If in (10) all the $S_i$ have area 0 the same follows for the union:


The union of any finite number of sets of area 0 has area 0. In
particular, any finite set of points has area 0.


By definition, a set of area 0 can be covered by a finite number of
squares $R_{ik}^n$ of arbitrarily small total area $A_n^+$. More generally, a set
S has area 0 if for each ε > 0 we can find a finite number of sets
$S_1, \ldots, S_N$ covering S, the sum of whose outer areas is less than ε, for then
by (8) and (9) the outer area of S is less than ε, and hence, since ε is
an arbitrary positive number, $A^+(S) = 0$.

For example, a continuous arc C in the plane given nonparametrically
by an equation

       $y = f(x) \qquad (a \le x \le b)$

has area 0. For the proof we only have to use the fact that a
continuous function defined in a closed and bounded interval is uniformly
continuous. For, given ε > 0, we can find an n so large that f differs
by less than ε for any two arguments in its domain that have distance
≤ $2^{-n}$. We can find integers α, β such that

       $\alpha 2^{-n} \le a < (\alpha + 1) 2^{-n}, \qquad \beta 2^{-n} \le b < (\beta + 1) 2^{-n}.$

The portion of the graph of f(x) corresponding to values x with
$i 2^{-n} \le x \le (i + 1) 2^{-n}$ is contained in a rectangle with sides that are
parallel to the coordinate axes and have the lengths $2^{-n}$ and 2ε. Hence, C
is contained in the union of these rectangles with sides parallel to the
axes of total area

       $(\beta + 1 - \alpha)\, 2^{-n}\, (2\varepsilon) \le (b - a + 2^{1-n})\, 2\varepsilon.$

For n → ∞ it follows that

       $A^+(C) \le 2(b - a)\varepsilon,$

and thus, since ε is an arbitrary positive number, that the arc C has
area 0.
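
The covering argument can be made concrete in a special case. The sketch below assumes an increasing function, for which the oscillation on each strip equals the difference of the endpoint values, so the graph over a strip fits into a rectangle of exactly that height (a slightly tighter cover than the rectangles of height 2ε used in the proof). The function name `graph_cover_area` and the sample f(x) = x² on [0, 1] are our own illustrative choices.

```python
def graph_cover_area(f, a, b, n):
    """Total area of the rectangles [x_i, x_{i+1}] x [f(x_i), f(x_{i+1})]
    that cover the graph of an increasing function f on [a, b], using
    2**n strips of equal width."""
    cells = 2 ** n
    h = (b - a) / cells
    total = 0.0
    for i in range(cells):
        x0, x1 = a + i * h, a + (i + 1) * h
        total += h * (f(x1) - f(x0))   # width times oscillation on the strip
    return total

for n in (2, 6, 10):
    # For f(x) = x^2 on [0, 1] the sum telescopes to 2**(-n) * (f(1) - f(0)).
    print(n, graph_cover_area(lambda x: x * x, 0.0, 1.0, n))
```

The total area of the cover shrinks like the strip width, so the outer area of the graph is smaller than any positive number.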
Most of the regions of practical interest have boundaries consisting
of a finite number of continuous arcs of the form y = f(x) or x =
g(y). Since the union of a finite number of sets of area 0 has itself area
0, we conclude that such regions have a boundary of area 0 and, hence,
are Jordan-measurable:


Let the boundary of a set S be contained in the union of a finite num- 
ber of arcs, each of which is given either by an equation y = f(x) or by 
an equation x = g(y) with the respective function f or g defined and con- 
tinuous in a finite closed interval. Then S has an area. 

We now consider the union and intersection of S and T, where S
and T are any two Jordan-measurable sets. A point that is interior to
S or to T is interior to S ∪ T; a point exterior to S and to T is exterior
to S ∪ T. Hence, a boundary point of S ∪ T must be a boundary point
of either S or T. Similarly, boundary points of S ∩ T must be boundary
points of either S or of T. Hence, the boundaries of S ∪ T and
S ∩ T lie in the union of ∂S and ∂T and have area 0, since the boundaries
∂S and ∂T have area 0. This proves the fundamental fact:


The union and intersection of two Jordan-measurable sets are again
Jordan-measurable.

Applying (9), we conclude:


If the sets S and T have an area, their union S ∪ T also has an
area and


(11)    $A(S \cup T) \le A(S) + A(T).$


Furthermore, if S and T do not overlap (i.e., interior points of either
one of the sets are exterior to the other), we can even conclude that


(12)    $A(S \cup T) = A(S) + A(T).$


For then a square $R_{ik}^n$ cannot be contained in both S and T. Hence, for
the nth subdivision

       $A_n^-(S \cup T) \ge A_n^-(S) + A_n^-(T).$

For n → ∞ it follows that

       $A^-(S \cup T) \ge A^-(S) + A^-(T).$


¹More generally, it follows in the same way that a set S in n dimensions is
Jordan-measurable if its boundary is contained in the union of a finite number of
surfaces, each given by an equation of the form

       $x_j = f(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n)$

with f continuous in a bounded closed set of $x_1 \cdots x_{j-1} x_{j+1} \cdots x_n$-space.

Since S, T and S ∪ T are Jordan-measurable this implies that

       $A(S \cup T) \ge A(S) + A(T),$

so that (12) follows from (11).
This result can be extended immediately to any finite number of
Jordan-measurable sets and constitutes the finite additivity of areas:


If each of the finite number of sets $S_1, \ldots, S_N$ has an area and no two
sets overlap, then the union S of $S_1, \ldots, S_N$ also has an area, and

(13)    $A(S) = A(S_1) + A(S_2) + \cdots + A(S_N).$


This addition theorem can be supplemented by a subtraction
theorem. Given two sets S, T with S ⊂ T, we denote by T − S the set
of points of T that are not contained in S. We shall prove that when
S and T have areas and S ⊂ T, then T − S has an area and


(14)    $A(T - S) = A(T) - A(S).$


It is easily seen again that the boundary of T − S is contained in
the union of the boundaries of T and of S, so that T − S has an area.
Moreover, S and T − S have no points in common, hence do not overlap,
and have union T, so that by the additivity rule (12)


       $A(T) = A(S) + A(T - S),$


which is equivalent to (14).
A more symmetric combination of the addition and subtraction
rules for areas consists in the identity


(15)    $A(S \cap T) + A(S \cup T) = A(S) + A(T)$


valid for any two Jordan-measurable sets S and T. Indeed, we have the
identity


       $S \cup T - T = S - S \cap T$


between the four sets S, T, S ∩ T, S ∪ T. Since all four sets have an
area, we can apply (14), and (15) follows.
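
Identity (15) is easy to test on sets built from finitely many squares of one subdivision. In the sketch below each set is represented, as an assumption made only for this illustration, by the set of index pairs (i, k) of its squares; the helper `area` and the two sample blocks are hypothetical. Shared edges between neighboring squares have area 0, so by finite additivity the area of such a union is the number of squares times the area of one square.

```python
def area(cells, n):
    """Area of a union of distinct closed squares of the nth subdivision,
    each given by its index pair (i, k)."""
    return len(cells) * 4.0 ** (-n)

n = 3
S = {(i, k) for i in range(0, 6) for k in range(0, 4)}   # a 6 x 4 block
T = {(i, k) for i in range(3, 9) for k in range(2, 7)}   # an overlapping 6 x 5 block

lhs = area(S & T, n) + area(S | T, n)
rhs = area(S, n) + area(T, n)
print(lhs, rhs)   # both sides of (15) agree
```

The same representation also shows the inequality (11): the union never has more squares than the two sets counted separately.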

The preceding theorems permit us to free the notion of area from
any reference to the special squares $R_{ik}^n$ used in its definition. We shall
see that area may be defined in terms of much more general methods of
subdivision of the plane, including, for example, subdivisions of the
plane into rectangles with sides parallel to the axes.

First, we observe that for a Jordan-measurable set S all points
sufficiently close to the boundary ∂S of S can be enclosed in a set of
arbitrarily small area, for, since ∂S has area 0, we can for a given ε > 0
find an n = n(ε) such that the set $\sigma_n$ of squares $R_{ik}^n$ having points in
common with ∂S has total area < ε/9. Let P be a point of the plane
that has distance < $2^{-n}$ from some point of ∂S. Then P either belongs
to one of the squares in $\sigma_n$ or to one of the eight neighbors of such a
square. The union of the set of all squares in $\sigma_n$ and of their neighbors
is then a set of area < ε that contains all points of distance < $2^{-n}$
from the points of ∂S.

Now take a subdivision Σ of the whole plane into closed rectangles
with sides parallel to the coordinate axes. The rectangles need not
be congruent, but we require that the subdivision be so fine that all
of the rectangles p have diameters¹ less than $2^{-n(\varepsilon)}$. We form the sum
$A_\Sigma^-(S)$ of the areas of all rectangles p of our subdivision that are
contained in S and also the sum $A_\Sigma^+(S)$ of the areas of all p that have
points in common with S. Clearly,

       $A_\Sigma^-(S) \le A(S) \le A_\Sigma^+(S).$

Moreover, $A_\Sigma^+(S) - A_\Sigma^-(S)$ represents the sum of the areas of all
rectangles p that contain both points in S and points not in S. These
rectangles necessarily contain boundary points of S. Since their
diameter is less than $2^{-n}$, each point of such a rectangle p will have a
distance less than $2^{-n}$ from some point of ∂S. Hence, the total area of
these rectangles will be less than ε. Thus,

       $A_\Sigma^+(S) - A_\Sigma^-(S) < \varepsilon,$

and consequently,

       $A(S) - A_\Sigma^-(S) < \varepsilon, \qquad A_\Sigma^+(S) - A(S) < \varepsilon.$

Taking a sequence of subdivisions $\Sigma_n$ of the plane into rectangles
with the largest diameter of any rectangle in $\Sigma_n$ tending to zero, we
find that the corresponding sums $A_{\Sigma_n}^-(S)$ and $A_{\Sigma_n}^+(S)$ tend to the area
A(S) of our set.

The argument used applies equally well to sequences of much more
general subdivisions $\Sigma_n$ of the whole plane into sets p. We need
require only that the individual sets p be Jordan-measurable, closed, and
connected and that the maximum diameter of any set p in a subdivision
tend to 0 as n → ∞.


1The diameter of a set is defined generally as the least upper bound (or, in the case of 
a closed and bounded set, as the maximum) of the distances of any two points in the 
set. In the case of a rectangle p this is the length of the diagonals. 
A.2 Integrals of Functions of Several Variables

a. Definition of the Integral of a Function f(x, y)


We first define the integral of a function f(x, y) over the whole x, y-
plane. Throughout this section we make the assumption that the
function f(x, y) is defined for all (x, y) but has the value 0 outside some
bounded set, that is, that f(x, y) = 0 for all (x, y) sufficiently far away
from the origin (such functions are said to have compact support).
Moreover, we assume that f is bounded.

In defining the integral of such a function f we make use of the same
kind of subdivision of the plane into closed squares $R_{ik}^n$ as in the case
of areas. Let $M_{ik}^n$ be the supremum and $m_{ik}^n$ the infimum¹ of f in the
square $R_{ik}^n$. We then associate with f and the nth subdivision of the
plane the upper sum

       $F_n^+ = \sum_{i,k} M_{ik}^n\, 2^{-2n}$

and the lower sum²

       $F_n^- = \sum_{i,k} m_{ik}^n\, 2^{-2n}.$


Only a finite number of terms in these sums are different from 0, since
f = 0 for distant points. Since $m_{ik}^n \le M_{ik}^n$, we have


(16)    $F_n^- \le F_n^+.$


In passing from the nth to the (n + 1)-st subdivision, each square
$R_{ik}^n$ is divided into four squares $R_{js}^{n+1}$ of area $2^{-2(n+1)}$ for which,
obviously,

       $m_{ik}^n \le m_{js}^{n+1} \le M_{js}^{n+1} \le M_{ik}^n.$

It follows that

(17)    $F_n^- \le F_{n+1}^- \le F_{n+1}^+ \le F_n^+.$


Since bounded monotone sequences converge (see Volume I, p. 96),
the upper and lower sums have limits

¹See the definitions in Volume I, p. 97.

²The factor $2^{-2n}$ represents the area of the squares $R_{ik}^n$ produced in the nth
subdivision. In three dimensions, where we subdivide space into cubes of side $2^{-n}$,
the factor becomes $2^{-3n}$ and, similarly, in k dimensions, $2^{-kn}$.

(18)    $F^- = \lim_{n \to \infty} F_n^-, \qquad F^+ = \lim_{n \to \infty} F_n^+,$

where, of course,

(19)    $F^- \le F^+.$


We call $F^+$ the upper integral and $F^-$ the lower integral of the function
f(x, y).


DEFINITION. The function f(x, y) is called integrable¹ if its upper
integral $F^+$ and its lower integral $F^-$ have the same value, which is then
called the integral of f and is denoted by

       $\iint f\, dx\, dy.$

Since

       $F^+ - F^- = \lim_{n \to \infty} (F_n^+ - F_n^-),$

we immediately have the following integrability condition: Necessary
and sufficient for the integrability of f is that

(20)    $\lim_{n \to \infty} (F_n^+ - F_n^-) = \lim_{n \to \infty} \sum_{i,k} (M_{ik}^n - m_{ik}^n)\, 2^{-2n} = 0.$


We can associate with the nth subdivision a Riemann sum

       $F_n = \sum_{i,k} f(\xi_{ik}^n, \eta_{ik}^n)\, 2^{-2n},$

where $(\xi_{ik}^n, \eta_{ik}^n)$ is an arbitrary point of the square $R_{ik}^n$. Clearly,

(21)    $F_n^- \le F_n \le F_n^+.$


We conclude from (18):

If f is integrable, the Riemann sums $F_n$ converge to the value of
$\iint f\, dx\, dy$ irrespective of the choice of the intermediate points $(\xi_{ik}^n, \eta_{ik}^n)$
in $R_{ik}^n$.

¹More precisely, “Riemann-integrable.” The definition given here differs from the
common one in so far as only the restricted class of subdivisions into squares $R_{ik}^n$ is
considered, but is equivalent to it.
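
The upper and lower sums can be computed in closed form for suitable test functions. The following sketch assumes, purely for illustration, the function f(x, y) = x(1 − x) y(1 − y) on the unit square and f = 0 elsewhere; this f is continuous, has compact support, and has integral 1/36. Since the two factors are nonnegative, the supremum (infimum) of f over a square is the product of the suprema (infima) of the factors, so each double sum is the square of a one-dimensional sum. The name `upper_lower_sums` is hypothetical.

```python
def upper_lower_sums(n):
    """Return (lower sum, upper sum) of the nth dyadic subdivision for
    f(x, y) = x(1-x) y(1-y) on the unit square, f = 0 elsewhere."""
    h = 2.0 ** (-n)
    g = lambda t: t * (1.0 - t)        # one-dimensional factor, max 1/4 at t = 1/2
    sup_sum = inf_sum = 0.0
    for i in range(2 ** n):
        a, b = i * h, (i + 1) * h
        hi = 0.25 if a <= 0.5 <= b else max(g(a), g(b))
        lo = min(g(a), g(b))           # g is concave, so the minimum is at an endpoint
        sup_sum += hi * h
        inf_sum += lo * h
    # Squares outside the unit square contribute 0 to both sums.
    return inf_sum ** 2, sup_sum ** 2

for n in range(1, 9):
    lower, upper = upper_lower_sums(n)
    print(n, lower, upper)   # both approach 1/36 = 0.02777...
```

The printed columns display the monotone behavior stated in (17) and the shrinking difference required by the integrability condition (20).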
b. Integrability of Continuous Functions and Integrals over Sets


For applications of the notion of integral the following theorem is
basic:


A continuous function f vanishing outside some bounded set S is
integrable.

For the proof we can assume that S is a square

       $|x| \le N, \qquad |y| \le N,$

where N is a positive integer. Then in the nth subdivision $M_{ik}^n = m_{ik}^n
= 0$ for $R_{ik}^n$ not contained in S. In the closed bounded set S the
continuous function f is uniformly continuous. Consequently, given ε > 0,
there exists a δ > 0 such that the values of f differ by less than ε
for any two points in S having distance less than δ. Hence,

       $M_{ik}^n - m_{ik}^n \le \varepsilon,$

provided n is so large that

       $\sqrt{2}\, 2^{-n} < \delta.$

Thus,

       $F_n^+ - F_n^- \le \varepsilon \sum 2^{-2n},$

where the summation is extended over all i, k for which the square
$R_{ik}^n$ is contained in S. Since the sum of the areas of those squares
equals the area $4N^2$ of S, it follows that

       $F_n^+ - F_n^- \le 4N^2 \varepsilon$

for all sufficiently large n and, hence, that f satisfies the integrability
condition (20).

The continuous functions are not the only integrable ones. We
shall not try to determine the most general integrable functions.
However, we do consider one important class of discontinuous
functions that are integrable, namely, the characteristic functions of
bounded Jordan-measurable sets. With any set S in the plane we
associate the characteristic function $\phi_S$ defined by

       $\phi_S(x, y) = \begin{cases} 1 & \text{for } (x, y) \in S, \\ 0 & \text{for } (x, y) \notin S. \end{cases}$

The points where $\phi_S$ is discontinuous are exactly the boundary points
of S.

We take now a bounded set S and investigate the integrability of
the function $\phi_S(x, y)$. The boundedness of S implies that $\phi_S$ vanishes
outside some bounded set. Obviously, for this function $M_{ik}^n = 1$ for
all squares $R_{ik}^n$ having points in common with S, and $M_{ik}^n = 0$ for the
others. Hence, the upper sum $F_n^+$ is just the sum $A_n^+(S)$ of the areas of
all squares $R_{ik}^n$ that have points in common with S. Thus, for the
function $\phi_S$ the upper integral $F^+ = \lim_{n \to \infty} F_n^+$ is identical with the outer
area $A^+(S)$. Similarly, $F_n^-$ equals the total area $A_n^-(S)$ of the squares $R_{ik}^n$
contained in S, so that the lower integral $F^-$ is the inner area $A^-(S)$.
Hence, integrability of $\phi_S$ is equivalent with $A^+(S) = A^-(S)$, that is,
with Jordan-measurability of S. When $\phi_S$ is integrable, the value F
of its integral is, of course, the area A(S). We have proved:


The sets S whose characteristic function $\phi_S$ is integrable are exactly
those that have an area. The integral of $\phi_S$ is the area of S:

       $\iint \phi_S\, dx\, dy = A(S).$


From continuous functions and characteristic functions of Jordan-
measurable sets, we can construct other integrable functions by
applying the rule:


The product of two integrable functions is integrable.

Let f and g be integrable, which for us implies that they are bounded
and vanish outside some bounded set. Let $M_{ik}^n$, $M'^{n}_{ik}$, $M''^{n}_{ik}$ denote
the supremum and $m_{ik}^n$, $m'^{n}_{ik}$, $m''^{n}_{ik}$ the infimum of the three functions
fg, f, g in the square $R_{ik}^n$. For any two points (ξ', η'), (ξ'', η''), we have

       $f(\xi', \eta')\, g(\xi', \eta') - f(\xi'', \eta'')\, g(\xi'', \eta'')$
       $\quad = f(\xi', \eta')\, [g(\xi', \eta') - g(\xi'', \eta'')] + g(\xi'', \eta'')\, [f(\xi', \eta') - f(\xi'', \eta'')].$

Hence, denoting by N an upper bound for |f| and |g|:

       $M_{ik}^n - m_{ik}^n \le N(M''^{n}_{ik} - m''^{n}_{ik}) + N(M'^{n}_{ik} - m'^{n}_{ik}).$


It follows immediately that fg satisfies the integrability condition (20)
if it is satisfied by f and by g.

Given a function f(x, y) and a set S in the x, y-plane, we say that f
is integrable over the set S if the function $f\phi_S$ is integrable in the sense
used before; we then define the integral of f over S by


(22)    $\iint_S f\, dx\, dy = \iint f \phi_S\, dx\, dy.$
We have from our product theorem:

An integrable function f is integrable over every Jordan-measurable
set S. In particular, every continuous function of compact support is
integrable over Jordan-measurable sets.

If f is integrable over the set S, the value of the integral

       $\iint_S f\, dx\, dy$

does not depend on the values of f at points not in S, since the function
$f\phi_S$ is determined by the values of f in the points of S. It is not even
necessary to have f defined everywhere. As long as S belongs to the
domain of a function f, we can define $f\phi_S$ to be equal to f at the points of
S and 0 everywhere else.

For any integrable f(x, y), we can always interpret

       $\iint f\, dx\, dy$

as

       $\iint_S f\, dx\, dy,$

where S is some sufficiently large square outside of which f vanishes.


c. Basic Rules for Multiple Integrals 


We saw already that the product of two integrable functions f and
g is again integrable. Even more trivial is the fact that f + g also is
integrable; this follows from the integrability condition (20) and the
observation that for any set

       $\sup(f + g) - \inf(f + g) \le (\sup f - \inf f) + (\sup g - \inf g).$

The representation of integrals as limits of Riemann sums then shows
that

(23)    $\iint (f + g)\, dx\, dy = \iint f\, dx\, dy + \iint g\, dx\, dy.$


An estimate analogous to the mean value theorem of integral
calculus for functions of a single variable is basic for all work with
integrals. Let S be a Jordan-measurable set and f an integrable
function. Let M be an upper bound and m a lower bound for f in S. We can
approximate the integral of $f\phi_S$ by Riemann sums

       $F_n = \sum_{i,k} f(\xi_{ik}^n, \eta_{ik}^n)\, \phi_S(\xi_{ik}^n, \eta_{ik}^n)\, 2^{-2n},$

where we take care to choose for $(\xi_{ik}^n, \eta_{ik}^n)$ a point of S if the square
$R_{ik}^n$ contains such a point. Thus,

       $F_n = \sum f(\xi_{ik}^n, \eta_{ik}^n)\, 2^{-2n},$

where the sum is extended over all i, k for which $R_{ik}^n$ has points in
common with S. Since m ≤ f ≤ M in S, we find that

       $m A_n^+(S) \le F_n \le M A_n^+(S).$

For n → ∞ it follows that

       $m A^+(S) \le F \le M A^+(S);$

since, by assumption, S has an area, we conclude that the inequality

(24)    $m A(S) \le \iint_S f\, dx\, dy \le M A(S)$

holds.
Let S' and S'' be Jordan-measurable sets that do not overlap (that
is, interior points of one are exterior to the other); let S be their union
and s their intersection. The characteristic functions of these sets
satisfy the relation

       $\phi_S + \phi_s = \phi_{S'} + \phi_{S''}.$

Hence, for any integrable function f we find, on applying (23), the
relation

       $\iint f\phi_S\, dx\, dy + \iint f\phi_s\, dx\, dy = \iint f\phi_{S'}\, dx\, dy + \iint f\phi_{S''}\, dx\, dy;$

that is,

       $\iint_S f\, dx\, dy + \iint_s f\, dx\, dy = \iint_{S'} f\, dx\, dy + \iint_{S''} f\, dx\, dy.$

Here, by assumption, s contains only boundary points of S' and of
S''. Thus, A(s) = 0, and, hence, by (24), also

       $\iint_s f\, dx\, dy = 0.$

This proves the law of additivity for integrals:

If the sets S' and S'' have areas and do not overlap and if f is
integrable, the relation

(25)    $\iint_{S' \cup S''} f\, dx\, dy = \iint_{S'} f\, dx\, dy + \iint_{S''} f\, dx\, dy$

holds.

More generally, if S is the union of the Jordan-measurable sets
$S_1, \ldots, S_N$, no two of which overlap, and if f is integrable, we have

(26)    $\iint_S f\, dx\, dy = \sum_{i=1}^{N} \iint_{S_i} f\, dx\, dy.$


This rule opens up the possibility of approximating integrals over a
set S by Riemann sums based on much more general subdivisions than
the ones we have considered so far. Assume, for simplicity, that S is
a closed Jordan-measurable set and f a function continuous in S.
A “general subdivision” Σ of S shall mean a representation of S
as the union of the Jordan-measurable sets $S_1, \ldots, S_N$, no two of
which overlap. In each $S_i$ we pick an arbitrary point $(\xi_i, \eta_i)$ and form
the generalized Riemann sum


(27)    $F_\Sigma = \sum_{i=1}^{N} f(\xi_i, \eta_i)\, A(S_i).$


We shall prove that $F_\Sigma$ tends to the integral of f over the set S as the
subdivision is refined indefinitely. The continuous function f is
uniformly continuous in the bounded closed set S. Given an ε > 0, we can
find a δ > 0 such that f varies by less than ε between any two points
of S having distance less than δ. Assume that the subdivision Σ is
so fine that all the $S_i$ have diameter < δ, that is, that any two points
in the same $S_i$ have distance less than δ. Then,

       $f(\xi_i, \eta_i) - \varepsilon \le f(\xi, \eta) \le f(\xi_i, \eta_i) + \varepsilon$

for all (ξ, η) in $S_i$. It follows from (24) that

       $[f(\xi_i, \eta_i) - \varepsilon]\, A(S_i) \le \iint_{S_i} f(\xi, \eta)\, d\xi\, d\eta \le [f(\xi_i, \eta_i) + \varepsilon]\, A(S_i).$

Hence, by (26), (27), (13),

       $F_\Sigma - \varepsilon A(S) \le \iint_S f\, dx\, dy \le F_\Sigma + \varepsilon A(S).$

It follows that the generalized Riemann sums $F_\Sigma$ differ arbitrarily
little from the value of the integral of f over S, for all sufficiently fine
subdivisions Σ.
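
A generalized Riemann sum need not use squares at all. As a minimal sketch, assume S is the closed unit disk and subdivide it into polar cells (annular sectors), each of which is Jordan-measurable with area ½(r₁² − r₀²)Δθ; the function name `polar_riemann_sum`, the choice of the cell midpoint as intermediate point, and the test integrand x² + y² (whose integral over the disk is π/2) are our own assumptions.

```python
import math

def polar_riemann_sum(f, n_r, n_theta):
    """Generalized Riemann sum of f over the unit disk for the subdivision
    into n_r * n_theta polar cells, evaluated at the cell midpoints."""
    dr = 1.0 / n_r
    dt = 2.0 * math.pi / n_theta
    total = 0.0
    for j in range(n_r):
        r0, r1 = j * dr, (j + 1) * dr
        cell_area = 0.5 * (r1 * r1 - r0 * r0) * dt   # area of an annular sector
        rm = 0.5 * (r0 + r1)
        for l in range(n_theta):
            tm = (l + 0.5) * dt
            total += f(rm * math.cos(tm), rm * math.sin(tm)) * cell_area
    return total

approx = polar_riemann_sum(lambda x, y: x * x + y * y, 50, 64)
print(approx, math.pi / 2)   # the sum is close to the exact value pi/2
```

Refining the radial and angular steps shrinks the diameters of the cells, and the sums then approach the integral as the argument above predicts.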
d. Reduction of Multiple Integrals to Repeated Single Integrals


The computation of the value of a triple integral can usually be
reduced to the evaluation of single and double integrals (and, similarly,
that of double integrals to single integrals and generally that of an
integral in n-space to integrals in (n − 1)-space) by use of the
following theorem:


Let f(x, y, z) be an integrable function defined in x, y, z-space.
Assume that for any fixed values of x, y we have in f(x, y, z) a function of
the single variable z that is integrable,¹ and let


(28)    $\int f(x, y, z)\, dz = h(x, y).$

Then h(x, y) as function of x, y is integrable and

(29)    $\iiint f(x, y, z)\, dx\, dy\, dz = \iint h(x, y)\, dx\, dy.$


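For the proof we consider the nth subdivision of x, y, z-space into
cubes $C_{ijk}^n$ given by

       $\dfrac{i}{2^n} \le x \le \dfrac{i+1}{2^n}, \qquad \dfrac{j}{2^n} \le y \le \dfrac{j+1}{2^n}, \qquad \dfrac{k}{2^n} \le z \le \dfrac{k+1}{2^n},$

and form the upper sum

       $F_n^+ = \sum_{i,j,k} M_{ijk}^n\, 2^{-3n},$

where $M_{ijk}^n$ is the supremum of f(x, y, z) in $C_{ijk}^n$ and, similarly, form
the lower sum $F_n^-$. We now take any fixed point (x, y) in the square

       $R_{ij}^n:\quad \dfrac{i}{2^n} \le x \le \dfrac{i+1}{2^n}, \qquad \dfrac{j}{2^n} \le y \le \dfrac{j+1}{2^n}.$

Then $M_{ijk}^n$ is an upper bound for f(x, y, z) as a function of z in the
interval

       $I_k^n:\quad \dfrac{k}{2^n} \le z \le \dfrac{k+1}{2^n}.$


¹Here, of course, single integrals are taken in the same sense as double integrals;
they are defined with the help of the special subdivisions on the line into intervals
$i 2^{-n} \le z \le (i + 1) 2^{-n}$, taking lower and upper sums, and so on.

It follows from (24) and (26) that for (x, y) ∈ $R_{ij}^n$

       $h(x, y) = \int f(x, y, z)\, dz = \sum_k \int_{I_k^n} f(x, y, z)\, dz \le \sum_k M_{ijk}^n\, 2^{-n}.$¹


Denote by $H_n^+$ and $H_n^-$ the upper and lower sums for the integral of
h(x, y) in the nth subdivision. It follows that

       $H_n^+ \le \sum_{i,j} \Big[ \sum_k M_{ijk}^n\, 2^{-n} \Big] 2^{-2n} = F_n^+,$

and similarly,

       $H_n^- \ge F_n^-.$

Since

       $\lim_{n \to \infty} F_n^+ = \lim_{n \to \infty} F_n^- = \iiint f(x, y, z)\, dx\, dy\, dz,$

it follows that h(x, y) is integrable and that (29) holds.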
Under appropriate assumptions we can further reduce the double
integral

       $\iint h(x, y)\, dx\, dy$

to a repeated single integral

       $\int g(x)\, dx,$

where for each fixed x the function g(x) is defined by

       $g(x) = \int h(x, y)\, dy.$

To apply this reduction we only have to know that for each fixed x
we have in h(x, y) an integrable function of y. This follows, however,
from the two-dimensional analogue of formula (29) if we make the
additional assumption that f(x, y, z) for any fixed x is an integrable
function in the y, z-plane, so that

       $\iint f(x, y, z)\, dy\, dz = \int h(x, y)\, dy = g(x).$

¹Implicit in our assumptions is, of course, that f vanishes outside some bounded
region, so that only a finite number of the intervals $I_k^n$ are involved.

Hence, we can evaluate the original triple integral by repeated single
integrations:


(30)    $\iiint f(x, y, z)\, dx\, dy\, dz = \int \left\{ \int \left[ \int f(x, y, z)\, dz \right] dy \right\} dx.$


A simple application, familiar from elementary calculus, is pro- 
vided by the formula for the reduction of a volume integral over a 
cylindrical region to a double integral. 

Assume that S, a closed set in the x, y-plane, has an area and that 
a(x, y), B(x, y) are continuous functions defined in S with a(x, y) < 
B(x, y). Let C denote the cylindrical region 


C: (x,y) E S a(x, y) S z < B(x, y). 


The boundary of C consists of the surfaces $z = \alpha(x, y)$ and $z = \beta(x, y)$,
which, by p. 521, have volume 0, and of the points in C for which
(x, y) lies on the boundary $\partial S$ of S. Since $\partial S$ has area 0, this latter set
also has volume 0. This shows that C is Jordan-measurable. Now let
f(x, y, z) be a continuous function defined in C. Then $f(x, y, z)\,\phi_C(x, y, z)$,
where $\phi_C$ denotes the characteristic function of C, is integrable and


$\iiint_C f\,dx\,dy\,dz = \iiint f(x, y, z)\,\phi_C(x, y, z)\,dx\,dy\,dz$


exists. Now for any fixed $(x, y) \in S$ the expression $f(x, y, z)\,\phi_C(x, y, z)$
vanishes outside the interval


$\alpha(x, y) \le z \le \beta(x, y)$


(which might shrink to a point) and is continuous in the interval.
Hence, $f(x, y, z)\,\phi_C(x, y, z)$ is integrable and has the integral


$h(x, y) = \int f(x, y, z)\,\phi_C(x, y, z)\,dz = \int_{\alpha(x, y)}^{\beta(x, y)} f(x, y, z)\,dz,$


where we have made use of the ordinary notation for definite integrals
over intervals. For $(x, y) \notin S$ we have $f(x, y, z)\,\phi_C(x, y, z) = 0$ for all z.
Hence, for any (x, y)


$h(x, y) = \phi_S(x, y) \int_{\alpha(x, y)}^{\beta(x, y)} f(x, y, z)\,dz.$


Consequently, in this case, the identity (29) yields


(31) $\iiint_C f(x, y, z)\,dx\,dy\,dz = \iint_S \left[ \int_{\alpha(x, y)}^{\beta(x, y)} f(x, y, z)\,dz \right] dx\,dy.$


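The reduction (31) can be tried numerically. The following is only a sketch under assumptions that are not in the text: S is taken to be the unit square, the bounding surfaces are α = 0 and β = x + y, the integrand is f = z, and all integrals are approximated by midpoint sums. The function name `integrate_over_cylinder` is illustrative. For this choice the inner integral is h(x, y) = (x + y)²/2 and the exact value of the triple integral is 7/12.

```python
def integrate_over_cylinder(f, alpha, beta, n=60, nz=20):
    """Approximate the triple integral of f over
    C = {(x, y) in [0, 1]^2, alpha(x, y) <= z <= beta(x, y)}
    as the repeated integral in (31), using midpoint sums."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            a, b = alpha(x, y), beta(x, y)
            dz = (b - a) / nz
            # inner single integral over alpha <= z <= beta
            inner = sum(f(x, y, a + (k + 0.5) * dz) for k in range(nz)) * dz
            # outer double integral over S
            total += inner * h * h
    return total


approx = integrate_over_cylinder(
    lambda x, y, z: z,       # assumed integrand f
    lambda x, y: 0.0,        # assumed lower surface alpha
    lambda x, y: x + y,      # assumed upper surface beta
)
exact = 7.0 / 12.0
print(approx, exact)
```

The inner sum plays the role of h(x, y); only after it has been formed for each (x, y) is the double integral over S taken, exactly as in (31).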
A.3 Transformation of Areas and Integrals 


a. Mappings of Sets 


Our aim will be to derive the rule by which a multiple integral is 
transformed when we change the variables of integration. Such a 
change of the independent variables x, y in the plane is a mapping 
T of the form 


(32) $\xi = f(x, y), \quad \eta = g(x, y),$


where f and g are defined in a set $\Omega$, the domain of the mapping. (Similar
mappings define a change of variable in higher dimensions.) Each
point (x, y) in $\Omega$ has a unique image $(\xi, \eta)$. The images form the range
$\omega = T(\Omega)$ of the mapping T (see p. 242). More generally, for any subset
S of $\Omega$ we denote by T(S) the set consisting of the images of all the
points of S.

For the mappings T considered here, we make the following as- 
sumptions: 

1. The domain $\Omega$ of T is an open bounded set in the x, y-plane.

2. The mapping functions f, g are continuous and have continuous
first derivatives $f_x, f_y, g_x, g_y$ in $\Omega$.

3. The Jacobian $\Delta$ of the mapping does not vanish in $\Omega$:


(33) $\Delta = \dfrac{d(\xi, \eta)}{d(x, y)} = \begin{vmatrix} f_x & f_y \\ g_x & g_y \end{vmatrix} = f_x g_y - f_y g_x \ne 0.$


4. The mapping is 1-1; that is, each point $(\xi, \eta)$ in $\omega$ is the image of
a single point (x, y) of $\Omega$.

Formula (33) has the important consequence (see p. 261) that for
every $\varepsilon$-neighborhood $N_\varepsilon$ of a point $(x_0, y_0)$ of $\Omega$ there exists a $\delta$-neighborhood
of the image point $(\xi_0, \eta_0)$ contained in $T(N_\varepsilon)$. This implies
that for any subset S of $\Omega$ an interior point of S is mapped into an
interior point of T(S). Thus, open sets S are mapped onto open sets
T(S).1 In particular, the range $\omega$ of our mapping is open.

Condition 4 states that there exists an inverse mapping $T^{-1}$, which
associates with every $(\xi, \eta)$ in $\omega$ the unique (x, y) in $\Omega$ that is mapped
by T onto $(\xi, \eta)$. The inverse mapping is given by functions


$x = \alpha(\xi, \eta), \quad y = \beta(\xi, \eta)$


defined in the open set $\omega$, which are continuous and have continuous
first derivatives


$\alpha_\xi = g_y/\Delta, \quad \alpha_\eta = -f_y/\Delta, \quad \beta_\xi = -g_x/\Delta, \quad \beta_\eta = f_x/\Delta$


(see p. 261). The Jacobian of the inverse mapping is


$\dfrac{d(x, y)}{d(\xi, \eta)} = \begin{vmatrix} \alpha_\xi & \alpha_\eta \\ \beta_\xi & \beta_\eta \end{vmatrix} = \alpha_\xi \beta_\eta - \alpha_\eta \beta_\xi = \dfrac{1}{\Delta}$


and, of course, is also different from zero.

Hence, in short, the inverse mapping $T^{-1}$ has all the properties we
postulated for T.
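The formulas for the derivatives of the inverse mapping can be checked on a concrete example. The sketch below assumes a mapping that does not appear in the text, $\xi = x^2 - y^2$, $\eta = 2xy$ restricted to the open first quadrant, where it is 1-1 with $\Delta = 4(x^2 + y^2) \ne 0$. Its inverse is computed with the principal complex square root, and the partial derivatives of the inverse are approximated by central differences. All function names are illustrative.

```python
import cmath


def T(x, y):
    # assumed example mapping: xi = x^2 - y^2, eta = 2xy
    return x * x - y * y, 2 * x * y


def T_inv(xi, eta):
    # inverse on the first quadrant via the principal square root of xi + i*eta
    z = cmath.sqrt(complex(xi, eta))
    return z.real, z.imag


def inverse_partials_formula(x, y):
    # f_x, f_y, g_x, g_y for the example mapping
    fx, fy, gx, gy = 2 * x, -2 * y, 2 * y, 2 * x
    delta = fx * gy - fy * gx
    # alpha_xi, alpha_eta, beta_xi, beta_eta as given in the text
    return (gy / delta, -fy / delta, -gx / delta, fx / delta), delta


def inverse_partials_numeric(xi, eta, h=1e-6):
    # central differences applied to the explicit inverse
    a_xi = (T_inv(xi + h, eta)[0] - T_inv(xi - h, eta)[0]) / (2 * h)
    a_eta = (T_inv(xi, eta + h)[0] - T_inv(xi, eta - h)[0]) / (2 * h)
    b_xi = (T_inv(xi + h, eta)[1] - T_inv(xi - h, eta)[1]) / (2 * h)
    b_eta = (T_inv(xi, eta + h)[1] - T_inv(xi, eta - h)[1]) / (2 * h)
    return a_xi, a_eta, b_xi, b_eta


x, y = 1.5, 0.5
xi, eta = T(x, y)
formula, delta = inverse_partials_formula(x, y)
numeric = inverse_partials_numeric(xi, eta)
inverse_jacobian = numeric[0] * numeric[3] - numeric[1] * numeric[2]
print(formula, numeric, inverse_jacobian, 1 / delta)
```

At the sample point the two sets of derivatives agree to the accuracy of the difference quotients, and the Jacobian of the inverse comes out as $1/\Delta$.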

In order to arrive at the area of the image of a set S, we first consider
a closed square $R^n_{ik}$ contained in $\Omega$ and estimate the area of $T(R^n_{ik})$.
We assume that we are given an upper bound $\mu$ for $|f_x|, |f_y|, |g_x|, |g_y|$
and an upper bound M for $|\Delta|$ in $R^n_{ik}$. We assume also that we have an
upper bound $\varepsilon$ for the amount by which any of the quantities
$f_x, f_y, g_x, g_y$ varies in $R^n_{ik}$. Introducing the abbreviations $x_i = i2^{-n}$, $y_k = k2^{-n}$
for the coordinates of the lower left-hand corner of $R^n_{ik}$, we can
approximate f and g in $R^n_{ik}$ by the linear functions


$f_{ik}(x, y) = f(x_i, y_k) + f_x(x_i, y_k)(x - x_i) + f_y(x_i, y_k)(y - y_k)$
$g_{ik}(x, y) = g(x_i, y_k) + g_x(x_i, y_k)(x - x_i) + g_y(x_i, y_k)(y - y_k).$


By the mean value theorem of differential calculus (see p. 67), we
have for every (x, y) in $R^n_{ik}$,


$f(x, y) = f(x_i, y_k) + f_x(x', y')(x - x_i) + f_y(x', y')(y - y_k)$
$g(x, y) = g(x_i, y_k) + g_x(x'', y'')(x - x_i) + g_y(x'', y'')(y - y_k),$


where (x', y') and (x'', y'') are suitable intermediate points on the
line joining (x, y) and $(x_i, y_k)$. It follows that for any (x, y) in $R^n_{ik}$,


1We say that T is an open mapping. 
$|f(x, y) - f_{ik}(x, y)| = |[f_x(x', y') - f_x(x_i, y_k)](x - x_i) + [f_y(x', y') - f_y(x_i, y_k)](y - y_k)| \le 2\varepsilon 2^{-n},$


and similarly,
$|g(x, y) - g_{ik}(x, y)| \le 2\varepsilon 2^{-n}.$
Now, the linear mapping
(34) $\xi = f_{ik}(x, y), \quad \eta = g_{ik}(x, y)$
takes the square $R^n_{ik}$ into the parallelogram $\pi^n_{ik}$ with vertices


$(f, g),\ (f + 2^{-n} f_x,\ g + 2^{-n} g_x),\ (f + 2^{-n} f_y,\ g + 2^{-n} g_y),$
$(f + 2^{-n} f_x + 2^{-n} f_y,\ g + 2^{-n} g_x + 2^{-n} g_y),$


where $f, g, f_x, f_y, g_x, g_y$ are to be taken at the point $(x_i, y_k)$. The area
of this parallelogram is the absolute value of the determinant (p. 195)


$\begin{vmatrix} 2^{-n} f_x & 2^{-n} f_y \\ 2^{-n} g_x & 2^{-n} g_y \end{vmatrix} = 2^{-2n} \Delta.$


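The coordinates $(\xi, \eta)$ of any point of $T(R^n_{ik})$ differ at most by $2\varepsilon 2^{-n}$ from
the corresponding coordinates of a point in $\pi^n_{ik}$ obtained by the linear
mapping. Hence, every point in $T(R^n_{ik})$ either lies in $\pi^n_{ik}$ or at a distance
at most $2^{3/2} \varepsilon 2^{-n}$ from one of the sides of $\pi^n_{ik}$. Each side of $\pi^n_{ik}$ has length
at most $\sqrt{2}\, 2^{-n} \mu$. The set of points lying within the distance $2^{3/2} \varepsilon 2^{-n}$
from one side has an area at most


$(4\sqrt{2}\, 2^{-n} \varepsilon)(\sqrt{2}\, 2^{-n} \mu) + \pi (2\sqrt{2}\, 2^{-n} \varepsilon)^2 = 8\varepsilon(\pi\varepsilon + \mu) 2^{-2n}.$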


Since the area of $\pi^n_{ik}$ does not exceed $M 2^{-2n}$, we find that $T(R^n_{ik})$ is
contained in a set whose area is at most


(35) $(M + 32\pi\varepsilon^2 + 32\mu\varepsilon) 2^{-2n}.$


Take now any square $R^N_{jr}$ arising in the Nth subdivision contained
in $\Omega$. In the closed set $R^N_{jr}$ the quantities $|f_x|, |f_y|, |g_x|, |g_y|$ have a
common upper bound $\mu$. Since $f_x, g_x, f_y, g_y$ are uniformly continuous
in $R^N_{jr}$, we can find a finer subdivision into squares $R^n_{ik}$ such that these
functions vary by less than $\varepsilon$ in each square $R^n_{ik} \subset R^N_{jr}$. If $M^n_{ik}$ denotes
the supremum of $|\Delta|$ in $R^n_{ik}$, we find from (35) that $T(R^N_{jr})$ is covered by
sets of total area at most


$\sum_{R^n_{ik} \subset R^N_{jr}} (M^n_{ik} + 32\pi\varepsilon^2 + 32\mu\varepsilon) 2^{-2n} = F_n^+ + (32\pi\varepsilon^2 + 32\mu\varepsilon) 2^{-2N},$


where $F_n^+$ is the upper sum corresponding to the nth subdivision for
the integral


$\iint_{R^N_{jr}} |\Delta|\,dx\,dy.$


For $n \to \infty$ the upper sums $F_n^+$ tend to the value of the integral, since
the function $|\Delta|$ is continuous and, thus, integrable over $R^N_{jr}$. Since
$\varepsilon$ is an arbitrary positive number we find [see (8), (10), pp. 519, 520] that
the outer area of the image of the square $R^N_{jr}$ satisfies the inequality


(36) $A^+[T(R^N_{jr})] \le \iint_{R^N_{jr}} |\Delta|\,dx\,dy,$


which represents the first step in our computation of the area of image 
sets. 

Now take any Jordan-measurable set S, which together with its
boundary $\partial S$ lies in the open set $\Omega$. We can find a closed set $S' \subset \Omega$
and an N such that for n > N any square $R^n_{ik}$ of side $2^{-n}$ that has
points in common with S lies completely in S'.1
For n > N, let the union of the squares $R^n_{ik}$ having points in common
with S be denoted by $S_n$. The image of $S_n$ is covered by the images of
those squares. Hence, (36) yields the estimate for the outer area of
T(S)


$A^+[T(S)] \le A^+[T(S_n)] \le \sum_{R^n_{ik} \subset S_n} A^+[T(R^n_{ik})] \le \sum_{R^n_{ik} \subset S_n} \iint_{R^n_{ik}} |\Delta|\,dx\,dy = \iint_{S_n} |\Delta|\,dx\,dy.$


For $n \to \infty$ the integral of $|\Delta|$ over $S_n$ tends to the integral over S,
since $|\Delta|$ is bounded in S' and the total area of the $R^n_{ik}$ that have points
in common with S without lying completely in S tends to 0 for the
Jordan-measurable set S. Thus, we have proved that


(37) $A^+[T(S)] \le \iint_S |\Delta|\,dx\,dy$


1We only have to choose for S' the union of all $R^N_{ik}$ having points in common with
S, where we take N sufficiently large.
for any Jordan-measurable set whose closure lies in $\Omega$.
Under the same assumptions on S, we can also apply (37) to the
boundary $\partial S$ of S, which is a closed subset of $\Omega$ of area 0. Then, by (37),


$A^+[T(\partial S)] \le \iint_{\partial S} |\Delta|\,dx\,dy \le \left( \max_{\partial S} |\Delta| \right) A(\partial S) = 0.$


Hence $T(\partial S)$ has area 0. Let $(\xi, \eta)$ be a boundary point of T(S) and
consider a sequence of points $(\xi_n, \eta_n)$ in T(S) with the limit $(\xi, \eta)$. The
$(\xi_n, \eta_n)$ are images of points $(x_n, y_n)$ in S. A subsequence of the $(x_n, y_n)$
converges to a point (x, y) in the closure of S and, hence, in $\Omega$. The continuity
of the mapping T implies that $(\xi, \eta)$ is the image of (x, y). Here
(x, y) cannot be an interior point of S, since then $(\xi, \eta)$ would have to
be an interior point of T(S) and not a boundary point. Hence, (x, y)
is a boundary point of S. Thus, the boundary of T(S) consists of images
of boundary points of S, and, hence, is a subset of the set $T(\partial S)$ that
has been shown to have area 0. Thus, the boundary of T(S) also has
area 0, and we have proved that T(S) is Jordan-measurable. We can
then replace $A^+[T(S)]$ in (37) by the area A[T(S)] and find that A[T(S)]
exists and satisfies


(38) $A[T(S)] \le \iint_S |\Delta|\,dx\,dy = \iint_S \left| \dfrac{d(\xi, \eta)}{d(x, y)} \right| dx\,dy$


for any Jordan-measurable set S whose closure lies in Q. 

We saw that the boundary of T(S) is contained in $T(\partial S)$ and, hence,
in $\omega$. Thus, T(S) is a Jordan-measurable set whose closure lies in $\omega = T(\Omega)$.
Since T and $T^{-1}$ have the same properties we can apply formula
(38) to the inverse mapping and find that also


(39) $A(S) \le \iint_{T(S)} \left| \dfrac{d(x, y)}{d(\xi, \eta)} \right| d\xi\,d\eta = \iint_{T(S)} \dfrac{1}{|\Delta|}\,d\xi\,d\eta.$


If we apply this last formula to a square $R^n_{ik}$ contained in $\Omega$, we find
that


$2^{-2n} = A(R^n_{ik}) \le \iint_{T(R^n_{ik})} \dfrac{1}{|\Delta|}\,d\xi\,d\eta \le \dfrac{1}{m^n_{ik}} A[T(R^n_{ik})],$


where $m^n_{ik}$ is the greatest lower bound of $|\Delta|$ in $R^n_{ik}$. Thus,


$A[T(R^n_{ik})] \ge m^n_{ik}\, 2^{-2n}.$
For any Jordan-measurable set S with closure in $\Omega$, let the union of
the $R^n_{ik} \subset S$ be denoted by $S_n$. Then


$A[T(S)] \ge A[T(S_n)] = \sum_{R^n_{ik} \subset S} A[T(R^n_{ik})] \ge \sum_{R^n_{ik} \subset S} m^n_{ik}\, 2^{-2n} = F_n^-,$


where $F_n^-$ is the lower sum for the integral of $|\Delta|$ over the set S. For
$n \to \infty$ we conclude that


$A[T(S)] \ge \iint_S |\Delta|\,dx\,dy.$


Combined with (38) we have thus proved the fundamental fact: 


Let S be a Jordan-measurable set whose closure lies in the domain
$\Omega$ of the mapping T. Then the image T(S) also has an area and this area
is given by the formula


(40) $A[T(S)] = \iint_{T(S)} d\xi\,d\eta = \iint_S \left| \dfrac{d(\xi, \eta)}{d(x, y)} \right| dx\,dy.$
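Formula (40) can be illustrated with a mapping whose image has a known area. The sketch below assumes, purely for illustration, the polar-coordinate mapping $\xi = x \cos y$, $\eta = x \sin y$ with Jacobian $\Delta = x$, and the rectangle S given by $1 \le x \le 2$, $0 \le y \le \pi/2$. The image T(S) is a quarter of the annulus between the circles of radius 1 and 2, whose area is $3\pi/4$ by elementary geometry. The helper name `area_of_image` is hypothetical.

```python
import math


def area_of_image(abs_jacobian, x_range, y_range, n=200):
    """Midpoint approximation of the integral of |Delta| over the rectangle S,
    which by (40) is the area of T(S)."""
    (x0, x1), (y0, y1) = x_range, y_range
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            total += abs_jacobian(x, y) * hx * hy
    return total


# assumed mapping: T(x, y) = (x cos y, x sin y), so |Delta| = |x|
area = area_of_image(lambda x, y: abs(x), (1.0, 2.0), (0.0, math.pi / 2))
print(area, 3 * math.pi / 4)
```

Because $|\Delta|$ is linear in x here, the midpoint sum reproduces the geometric value up to rounding.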


b. Transformation of Multiple Integrals


It is easy to pass from formula (40), which represents the law of 
transformation of areas, to the more general formula for transforma- 
tion of integrals. We make the same assumptions on the mapping T 
as before. Now let S be a closed Jordan-measurable set contained in
$\Omega$ and let F(x, y) be a function that is defined and continuous for (x, y)
in S. Since the inverse mapping $x = \alpha(\xi, \eta)$, $y = \beta(\xi, \eta)$ is continuous in
$\omega$, the function $F(\alpha(\xi, \eta), \beta(\xi, \eta))$ is defined and continuous in the set
T(S). We again denote this function of $\xi$ and $\eta$ by the letter F. The law
of transformation for integrals then takes the form


(41) $\iint_{T(S)} F\,d\xi\,d\eta = \iint_S F \left| \dfrac{d(\xi, \eta)}{d(x, y)} \right| dx\,dy.$


For the proof, we use the representation of integrals of continuous 
functions by generalized Riemann sums (see p. 530). We consider a 
general subdivision of S: 


$S = \bigcup_i S_i,$
where the $S_i$ are closed Jordan-measurable subsets of S that do not
overlap. The image sets $T(S_i)$ furnish a corresponding subdivision of
the set T(S). Since the mapping T is uniformly continuous in the
closed set S, the diameters of the image sets $T(S_i)$ tend to 0 when those
of the $S_i$ do. Take a subdivision so fine that F varies by less than $\varepsilon$
in each $S_i$. Let $(x_i, y_i)$ be a point in $S_i$. Then $F(x_i, y_i)$ is also one of the
values taken by the function $F(\alpha(\xi, \eta), \beta(\xi, \eta))$ in the set $T(S_i)$. We form
the Riemann sum corresponding to the left-hand integral in (41):


$\sum_i F(x_i, y_i)\, A[T(S_i)] = \sum_i \iint_{S_i} F(x_i, y_i)\, |\Delta(x, y)|\,dx\,dy$
$= \sum_i \iint_{S_i} F(x, y)\, |\Delta(x, y)|\,dx\,dy + r$
$= \iint_S F(x, y)\, |\Delta(x, y)|\,dx\,dy + r,$
where
$|r| = \left| \sum_i \iint_{S_i} (F(x_i, y_i) - F(x, y))\, |\Delta(x, y)|\,dx\,dy \right|$
$\le \varepsilon \sum_i \iint_{S_i} |\Delta(x, y)|\,dx\,dy = \varepsilon A[T(S)].$


As the subdivision becomes finer, the Riemann sum tends to the integral
of F over the set T(S). For $\varepsilon \to 0$ we obtain the identity (41).
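Both sides of (41) can be compared numerically. The following sketch rests on assumptions chosen only for illustration: the mapping is again $\xi = x \cos y$, $\eta = x \sin y$ (so $|\Delta| = x$), S is the rectangle $1 \le x \le 2$, $0 \le y \le \pi/2$, and $F(\xi, \eta) = \xi^2 + \eta^2$. The right side is a midpoint sum over S; the left side is a crude midpoint sum over a square grid in the $\xi, \eta$-plane that keeps only the cells whose centers lie in T(S). The exact common value is $15\pi/8$.

```python
import math


def F(xi, eta):
    # assumed integrand
    return xi * xi + eta * eta


def right_side(n=200):
    """Integral of F * |Delta| over S = [1, 2] x [0, pi/2] in the x, y-plane."""
    hx, hy = 1.0 / n, (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        x = 1.0 + (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            xi, eta = x * math.cos(y), x * math.sin(y)
            total += F(xi, eta) * abs(x) * hx * hy
    return total


def left_side(n=400):
    """Integral of F over T(S), the quarter annulus 1 <= xi^2 + eta^2 <= 4
    in the first quadrant, sampled at cell midpoints of a grid on [0, 2]^2."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        for j in range(n):
            eta = (j + 0.5) * h
            if 1.0 <= xi * xi + eta * eta <= 4.0:
                total += F(xi, eta) * h * h
    return total


print(left_side(), right_side(), 15 * math.pi / 8)
```

The left-side sum is less accurate because of the cells cut by the curved boundary of T(S), which is one practical reason for transforming to variables in which the region is a rectangle.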


A.4 Note on the Definition of the Area of a Curved Surface 


In Section 4.8 (p. 423) we defined the area of a curved surface in a way 
somewhat dissimilar to that in which we defined the length of arc in 
Volume I (p. 348). In the definition of length, we started with inscribed 
polygons, while in the definition of area we used tangent planes in- 
stead of inscribed polyhedra. 

In order to see why we cannot use inscribed polyhedra, we consider
that part of the cylinder with the equation $x^2 + y^2 = 1$ in x, y, z-space,
which lies between the planes z = 0 and z = 1. The area of this cylindrical
surface is $2\pi$. In it we now inscribe a polyhedral surface, all
of whose faces are identical triangles, as follows: We first subdivide
the circumference of the unit circle into n equal parts, and on the
cylinder we consider the m equidistant horizontal circles z = 0, z = h,
z = 2h, . . ., z = (m - 1)h, where h = 1/m. We subdivide each of these
circles into n equal parts in such a way that the points of division of
each circle lie above the centers of the arcs of the preceding circle.
We now consider a polyhedron inscribed in the cylinder whose edges 
consist of the chords of the circles and of the lines joining neighboring 
points of division of neighboring circles. The faces of this polyhedron 
are congruent isosceles triangles, and if n and m are chosen sufficient- 
ly large, this polyhedron will lie as close as we please to the cylindri- 
cal surface. If we now keep n fixed, we can choose m so large that each 
of the triangles is as nearly parallel as we please to the x, y-plane and 
therefore makes an arbitrarily steep angle with the surface of the cyl- 
inder. Then we can no longer expect that the sum of the areas of the 
triangles will be an approximation to the area of the cylinder. In fact, 
the bases of the individual triangles have the length $2 \sin(\pi/n)$, and the
altitude, by the Pythagorean theorem, the length


$\sqrt{\dfrac{1}{m^2} + \left( 1 - \cos\dfrac{\pi}{n} \right)^2} = \sqrt{\dfrac{1}{m^2} + 4 \sin^4 \dfrac{\pi}{2n}}.$

Since the number of triangles is obviously 2mn, the surface area of
the polyhedron is


$F_{n,m} = 2mn \sin\dfrac{\pi}{n} \sqrt{\dfrac{1}{m^2} + 4 \sin^4 \dfrac{\pi}{2n}} = 2n \sin\dfrac{\pi}{n} \sqrt{1 + 4m^2 \sin^4 \dfrac{\pi}{2n}}.$

The limit of this expression is not independent of the way in which
m and n tend to infinity. If, for example, we keep n fixed and let $m \to \infty$,
the expression increases beyond all bounds. If, however, we make
m and n tend to $\infty$ together, putting m = n, the expression tends to $2\pi$.
If we put $m = n^2$, we obtain the limit


$2\pi \sqrt{1 + \pi^4/4},$


and so on. From the above expression $F_{n,m}$ for the area of the polyhedron
we see that the lower limit (lower point of accumulation) of the set
of numbers $F_{n,m}$ is $2\pi$, where m tends to infinity with n in any manner
whatsoever.1 This follows at once from $F_{n,m} \ge 2n \sin(\pi/n)$ and
$\lim_{n \to \infty} 2n \sin(\pi/n) = 2\pi.$

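The dependence of the limit on the way m and n grow is easy to observe numerically. The sketch below simply evaluates the closed expression for $F_{n,m}$ derived above; the function name `lantern_area` and the sample values of n and m are illustrative choices, not part of the text.

```python
import math


def lantern_area(n, m):
    """Area F_{n,m} of the inscribed polyhedron with 2mn congruent triangles."""
    s = math.sin(math.pi / (2 * n))
    return 2 * n * math.sin(math.pi / n) * math.sqrt(1 + 4 * m * m * s ** 4)


n = 1000
print(lantern_area(n, n), 2 * math.pi)                       # m = n
print(lantern_area(n, n ** 2),
      2 * math.pi * math.sqrt(1 + math.pi ** 4 / 4))         # m = n^2
print(lantern_area(n, n ** 3))                               # m = n^3 grows without bound
```

For m = n the value is close to the true area $2\pi$, for $m = n^2$ it is close to $2\pi\sqrt{1 + \pi^4/4}$, and for $m = n^3$ it is already far larger than either.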


1The lower limit L of a bounded sequence $F_n$ (denoted by $L = \liminf_{n \to \infty} F_n$) can be defined
in several equivalent ways:
a) L is the greatest lower bound of the limits of all convergent subsequences of the
$F_n$.
b) L is the limit for $N \to \infty$ of the greatest lower bounds of the sets obtained from
the $F_n$ by omitting the first N terms.
In conclusion we mention, without proof, a theoretically interesting
fact of which the example just given is a particular instance. If
we have any arbitrary sequence of polyhedra tending to a given 
surface, we have seen that the areas of the polyhedra need not tend to 
the area of the surface. But the limit of the areas of the polyhedra 
(if it exists) or, more generally, any point of accumulation of the 
values of these areas is always greater than, or at least equal to, the 
area of the curved surface. If for every sequence of such polyhedral 
surfaces we find the lower limit of the area, these numbers form a 
definite set of numbers associated with the curved surface. The area 
of the surface can be defined as the greatest lower bound of this set of 
numbers. 


c) L is the lower point of accumulation (see Volume I, p. 95) of the $F_n$; that is, L is the
smallest number with the property that every neighborhood of L contains points
$F_n$ for infinitely many n.
d) For every positive $\varepsilon$ we have $F_n < L - \varepsilon$ for at most a finite number of n, and
$F_n < L + \varepsilon$ for infinitely many n.
The upper limit $M = \limsup_{n \to \infty} F_n$ of the sequence $F_n$ is defined analogously. The sequence
converges if and only if L = M.
1This remarkable property of the area is called semicontinuity or, more precisely, 
lower semicontinuity. 


CHAPTER 
5


Relations Between Surface 
and Volume Integrals 


The multiple integrals discussed in the previous chapter are not the 
only possible extension of the concept of integral to more than one 
independent variable. Other generalizations arise from the fact that 
regions of several dimensions may contain manifolds of fewer dimen- 
sions and that we can consider integrals over such manifolds. Thus, 
for two independent variables, we considered not only the integrals 
over two-dimensional regions but also integrals along curves, which 
are one-dimensional manifolds. With three independent variables, 
besides integrals over three-dimensional regions and integrals along 
curves, we encounter integrals over curved surfaces. In the present 
chapter we shall introduce surface integrals and discuss the mutual 
relations between integrals over manifolds of varying dimensions.1


5.1 Connection Between Line Integrals and Double Integrals 
in the Plane (The Integral Theorems of Gauss, Stokes, and 
Green) 


For functions of a single independent variable the fundamental 


1We use the term manifold without precise definition as a generic name for sets of 
an unspecified number of dimensions. In this book we deal exclusively with manifolds 
that are subsets of some euclidean space, such as the curves, two-dimensional sur- 
faces, hypersurfaces, and four-dimensional regions in four-dimensional euclidean 
space. More generally, manifolds can be defined without reference to a surrounding 
euclidean space. Such manifolds locally resemble deformed portions of euclidean 
space, while their over-all structure can be much more complicated than that of 
euclidean space.

formula stating the relation between differentiation and integration
(cf. Volume I, p. 190) is


(1) $\int_{x_0}^{x_1} f'(x)\,dx = f(x_1) - f(x_0).$


An analogous formula—Gauss’s theorem, also called the divergence 
theorem—holds in two dimensions. Here again, the integral of a 
derivative of functions 


$\iint_R f_x(x, y)\,dx\,dy \quad \text{or} \quad \iint_R g_y(x, y)\,dx\,dy$


is transformed into an expression that depends on the values of the 
functions themselves on the boundary. We regard here the boundary
C of the set R as an oriented curve +C, choosing as positive sense on
C the one for which the region R remains on the “left” side as we describe
the boundary curve C.1 Gauss’s theorem then states that


(2) $\iint_R [f_x(x, y) + g_y(x, y)]\,dx\,dy = \int_{+C} [f(x, y)\,dy - g(x, y)\,dx].$


This theorem contains as a special case our previous formula ex- 
pressing the area A of the set R as a line integral over the boundary C 
of R. We put f(x, y) = x, g(x, y) = 0 and at once obtain 


$A = \iint_R dx\,dy = \int_{+C} x\,dy.$


In exactly the same way, for f(x, y) = 0 and g(x, y) = y, we obtain


$A = \iint_R dx\,dy = -\int_{+C} y\,dx$


in agreement with Volume I (p. 367). 
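For a polygon the two line integrals for the area can be evaluated edge by edge, since along a straight segment the integral of x dy equals the mean of the two end abscissas times the change in y, and similarly for y dx. The sketch below assumes the vertices are listed counterclockwise, so that the region lies on the left; the function names are illustrative.

```python
def area_x_dy(vertices):
    """Area as the line integral of x dy over a counterclockwise polygon."""
    total = 0.0
    k = len(vertices)
    for i in range(k):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % k]
        # exact value of the integral of x dy along the straight edge
        total += 0.5 * (x0 + x1) * (y1 - y0)
    return total


def area_minus_y_dx(vertices):
    """Area as minus the line integral of y dx over a counterclockwise polygon."""
    total = 0.0
    k = len(vertices)
    for i in range(k):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % k]
        # exact value of the integral of y dx along the straight edge, with sign reversed
        total -= 0.5 * (y0 + y1) * (x1 - x0)
    return total


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]   # right triangle with legs 4 and 3
print(area_x_dy(triangle), area_minus_y_dx(triangle))
```

Listing the vertices clockwise reverses the orientation of +C and changes the sign of both results.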

The divergence theorem becomes particularly suggestive in the no- 
tation of the calculus of differential forms, as explained on pp. 307-324. 
In (2), the line integral has the integrand 


$L = f(x, y)\,dy - g(x, y)\,dx,$


a first-order differential form. Indeed, L can be identified with the most
general first-order form a(x, y) dx + b(x, y) dy if we take f = b, g = -a.
By the definition on p. 313 the derivative of this form is 


1Assuming that the x, y-coordinate system is right-handed. 


$dL = df\,dy - dg\,dx = (f_x\,dx + f_y\,dy)\,dy - (g_x\,dx + g_y\,dy)\,dx$
$= f_x\,dx\,dy - g_y\,dy\,dx = (f_x + g_y)\,dx\,dy,$


which is just the integrand of the double integral in (2). Hence, formula
(2) takes the form1


(2a) $\iint_R dL = \int_{+C} L.$


In the proof we restrict ourselves to the case in which R is an open
set whose boundary C is a simple closed curve consisting of a finite
number of smooth arcs; moreover, we assume that every parallel to one
of the coordinate axes intersects C in at most two points.1 We require
f and g to be continuous and to have continuous first derivatives in 
the closure of R (consisting of R and of its boundary C). 

We first assume that the function g vanishes identically. Then the
double integral of $f_x$ over R exists and can be written as a repeated
integral2


(3) $\iint_R f_x(x, y)\,dx\,dy = \int dy \int f_x(x, y)\,dx.$


On each parallel to the x-axis, the variable y is constant. The parallels
to the x-axis intersecting R correspond to y-values forming an
open interval $\eta_0 < y < \eta_1$, the projection of R onto the y-axis.3 For


1The process of forming the boundary of a set R presents formal analogies with differentiation.
For that reason one frequently uses the symbol $\partial R$ for the boundary +C
of R, writing (2a) as


(2b) $\iint_R dL = \int_{\partial R} L.$


This formula actually applies much more generally to differential forms integrated 
over manifolds in n-dimensional space (see p. 624). 

1In the Appendix the theorem (and its generalizations in higher dimensions) is 
proved under the assumption that R is the closure of an open set bounded by a simple 
curve that is smooth everywhere. 

2The set R is bounded by the union of a finite number of smooth arcs and, hence (see
p. 521), is Jordan-measurable. The integral of the continuous function $f_x$ over R exists
then and is defined as the integral of $\phi_R f_x$ over the whole plane, where $\phi_R$ is the characteristic
function of the set R (that is, $\phi_R$ is 1 in the points of R but is 0 in all other
points). The reduction of the double integral to a repeated integral is permitted (see
p. 531) since the function $\phi_R f_x$ can be integrated over each parallel to the x-axis;
indeed, each parallel to the x-axis meets R in either an open interval or nowhere, so
that the integral of $\phi_R f_x$ over a parallel to the x-axis is either the integral of the continuous
function $f_x$ over an open interval or zero.

3The projection of R is an open interval because R is open and its boundary is a
simple closed curve and, hence, connected.

each y in that interval the corresponding parallel to the x-axis cuts
out of R an interval $x_0(y) < x < x_1(y)$ whose end points are the abscissas
of the two points of intersection of the parallel with C (see
Fig. 5.1). Formula (3) asserts more precisely that


Figure 5.1


$\iint_R f_x\,dx\,dy = \int_{\eta_0}^{\eta_1} h(y)\,dy,$


where

$h(y) = \int_{x_0(y)}^{x_1(y)} f_x(x, y)\,dx = f(x_1(y), y) - f(x_0(y), y).$
Hence,
(4) $\iint_R f_x\,dx\,dy = \int_{\eta_0}^{\eta_1} f(x_1(y), y)\,dy - \int_{\eta_0}^{\eta_1} f(x_0(y), y)\,dy.$


We introduce the two simple oriented arcs $+C_1$, $+C_0$ given parametrically,
respectively, by


$+C_1:\ x = x_1(t),\ y = t, \quad \text{for } \eta_0 \le t \le \eta_1$
$+C_0:\ x = x_0(t),\ y = t, \quad \text{for } \eta_0 \le t \le \eta_1,$


where in each case the sense of increasing t corresponds to the
orientation of the arc. Formula (4) can then be written as


$\iint_R f_x\,dx\,dy = \int_{+C_1} f\,dy - \int_{+C_0} f\,dy.$
Now $C_1$ and $C_0$ form respectively the right and left portions of C,
where, however, $+C_1$ has the same orientation as C and $+C_0$ the opposite
one. Denoting by $-C_0$ the arc obtained by reversing the orientation
of $C_0$, we obtain (see p. 94)


$\iint_R f_x\,dx\,dy = \int_{+C_1} f\,dy + \int_{-C_0} f\,dy = \int_{+C} f\,dy.$


We can similarly decompose +C into an “upper” arc
$+\Gamma_1:\ x = t,\ y = y_1(t), \quad \text{for } \xi_0 \le t \le \xi_1$

and “lower” arc
$+\Gamma_0:\ x = t,\ y = y_0(t), \quad \text{for } \xi_0 \le t \le \xi_1,$

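oriented according to the sense of increasing t. Here the interval
$\xi_0 < x < \xi_1$ represents the projection of R onto the x-axis. Then,


$\iint_R g_y\,dx\,dy = \int_{\xi_0}^{\xi_1} dx \int_{y_0(x)}^{y_1(x)} g_y\,dy$
$= \int_{\xi_0}^{\xi_1} g(x, y_1(x))\,dx - \int_{\xi_0}^{\xi_1} g(x, y_0(x))\,dx$
$= \int_{+\Gamma_1} g\,dx - \int_{+\Gamma_0} g\,dx$
$= -\int_{-\Gamma_1} g\,dx - \int_{+\Gamma_0} g\,dx$
$= -\int_{+C} g\,dx,$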


since here $\Gamma_0$ has the same orientation as C and $\Gamma_1$ the opposite one.
Adding the two identities obtained, we arrive at the general formula 
(2). 
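Formula (2) can be checked numerically on a region that satisfies the hypotheses of the proof. The sketch below assumes the unit disk for R (every parallel to an axis meets its boundary in at most two points) and the illustrative functions $f = x^3$, $g = y^3$, so that $f_x + g_y = 3x^2 + 3y^2$. The double integral is approximated in polar coordinates, which is a convenience of this sketch and not part of the proof; the boundary is traversed counterclockwise. Both sides should be close to $3\pi/2$.

```python
import math


def f(x, y):
    return x ** 3          # assumed example


def g(x, y):
    return y ** 3          # assumed example


def divergence(x, y):
    return 3 * x * x + 3 * y * y   # f_x + g_y for the example


def area_side(n=300):
    """Double integral of f_x + g_y over the unit disk (midpoint sums in r and t)."""
    hr, ht = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            total += divergence(r * math.cos(t), r * math.sin(t)) * r * hr * ht
    return total


def boundary_side(n=2000):
    """Line integral of f dy - g dx over the unit circle, counterclockwise."""
    ht = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * ht
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)    # derivatives with respect to t
        total += (f(x, y) * dy - g(x, y) * dx) * ht
    return total


print(area_side(), boundary_side(), 3 * math.pi / 2)
```

Reversing the direction of traversal of the circle changes the sign of the boundary side only, which is why the orientation convention for +C matters.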

We can now extend our formula to more general open sets R
bounded by a simple closed curve C, provided C can be decomposed
into a finite number of simple arcs $C_1, \ldots, C_n$ each of which is intersected
in at most one point by any parallel to one of the coordinate
axes.1 In order to prove that here also


(5) $\iint_R f_x\,dx\,dy = \int_{+C} f\,dy,$


we draw parallels to the y-axis through all of the end points of the
simple arcs $C_i$ (see Fig. 5.2). In this way R is decomposed into a finite


Figure 5.2


number of sets $R_1, \ldots, R_N$ each of which is bounded laterally by
straight segments parallel to the y-axis and above and below by simple
subarcs of two of the arcs $C_i$. We can apply the formula


$\iint_{R_i} f_x\,dx\,dy = \int_{+\Gamma_i} f\,dy$


to each of the sets $R_i$ with boundary $\Gamma_i$, since $\Gamma_i$ is intersected by
each parallel to the x-axis in at most two points. Here the orientation
of the boundary curve $+\Gamma_i$ agrees with that of +C in the nonvertical
portions and is that of increasing y on the right-hand boundary
and of decreasing y on the left-hand one. Adding up the formulae


1This assumption is not always satisfied. The boundary curve C may, for example,
consist in part of the curve $y = x^2 \sin(1/x)$, which is cut by the x-axis in an infinite
number of points and cannot be decomposed into a finite number of arcs cut in only
one point.
for $i = 1, \ldots, N$, the double integrals over the $R_i$ yield the double
integral over R. In the line integrals over the $+\Gamma_i$ the contributions
over the vertical auxiliary segments cancel out, since each segment is
traversed twice, once upward, once downward. Hence, the line integrals
over the curves $+\Gamma_i$ add up to that over the whole curve +C,
and one obtains formula (5). In the same way one proves that


$\iint_R g_y\,dx\,dy = -\int_{+C} g\,dx$


by dividing R by parallels to the x-axis through all of the end points
of the arcs $C_i$.

The same arguments also show that we can dispense with the 
assumption that the boundary C of R consists of a single closed curve 
C. The divergence theorem (2) applies just as well when C consists of 
several closed curves, as long as C can be decomposed into a finite 
number of simple arcs each intersected in at most one point by paral- 
lels to the axes. In taking the integral over + C we have to give each 
of the closed components of C the orientation corresponding to leav- 
ing R on the left-hand side. Decomposition by parallels to the y-axis 
still results then in regions whose boundary is intersected in at most 
two points by any parallel to the x-axis (see Fig. 5.3). 


Figure 5.3 


In this manner we prove the divergence theorem for more general 
regions R by decomposing R into regions for which the theorem has 
already been proved. Often, we can instead transform R into a region 
to which the theorem is known to apply. Writing the divergence theo- 
rem as 
$\iint_R dL = \int_{+C} L,$


we notice that the differential forms dL and L are defined independent- 
ly of coordinates, as explained in Section 3.6d, p. 322. Let 


$x = x(u, v), \quad y = y(u, v)$
be a continuously differentiable 1-1 transformation, with positive
Jacobian, that takes R into a set R* with boundary C* in the u, v-plane.
Then,
$L = f\,dy - g\,dx = f(y_u\,du + y_v\,dv) - g(x_u\,du + x_v\,dv)$

$= (f y_u - g x_u)\,du + (f y_v - g x_v)\,dv$

$= A\,du + B\,dv,$
where

$A = f y_u - g x_u, \quad B = f y_v - g x_v.$
The derivative of L computed in either x, y or u, v variables is given by
$dL = df\,dy - dg\,dx = (f_x + g_y)\,dx\,dy$
$= dA\,du + dB\,dv = (B_u - A_v)\,du\,dv,$


so that (as can also be verified directly)


$(f_x + g_y) \dfrac{d(x, y)}{d(u, v)} = B_u - A_v.$
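The identity $(f_x + g_y)\, d(x, y)/d(u, v) = B_u - A_v$ can be tested at a sample point with difference quotients. The sketch below assumes example data that are not in the text: $f = x^2 y$, $g = x + y^3$, and the polar transformation $x = u \cos v$, $y = u \sin v$, whose Jacobian is u (positive for u > 0). The derivatives $B_u$ and $A_v$ are approximated by central differences.

```python
import math


def f(x, y):
    return x * x * y           # assumed example


def g(x, y):
    return x + y ** 3          # assumed example


def f_x_plus_g_y(x, y):
    return 2 * x * y + 3 * y * y


def x_of(u, v):
    return u * math.cos(v)


def y_of(u, v):
    return u * math.sin(v)


def A(u, v):
    # A = f * y_u - g * x_u with y_u = sin v, x_u = cos v
    x, y = x_of(u, v), y_of(u, v)
    return f(x, y) * math.sin(v) - g(x, y) * math.cos(v)


def B(u, v):
    # B = f * y_v - g * x_v with y_v = u cos v, x_v = -u sin v
    x, y = x_of(u, v), y_of(u, v)
    return f(x, y) * (u * math.cos(v)) - g(x, y) * (-u * math.sin(v))


def both_sides(u, v, h=1e-5):
    B_u = (B(u + h, v) - B(u - h, v)) / (2 * h)
    A_v = (A(u, v + h) - A(u, v - h)) / (2 * h)
    jacobian = u               # d(x, y)/d(u, v) for this transformation
    left = f_x_plus_g_y(x_of(u, v), y_of(u, v)) * jacobian
    return left, B_u - A_v


print(both_sides(1.3, 0.7))
```

The two numbers agree up to the error of the difference quotients, as the coordinate-free definition of dL predicts.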


Let C be referred to a parameter t:
$x = x(t), \quad y = y(t), \quad a \le t \le b,$


where the orientation of +C corresponds to increasing t. Using for the
corresponding points of +C* the same parameter value t, we have for
the line integrals of L over C and C* the common value


$\int_{+C} L = \int_{+C^*} L = \int_a^b \left[ f \dfrac{dy}{dt} - g \dfrac{dx}{dt} \right] dt = \int_a^b \left[ A \dfrac{du}{dt} + B \dfrac{dv}{dt} \right] dt.$
Similarly, we have the same value for the area integrals in the two
planes:


$\iint_R dL = \iint_R (f_x + g_y)\,dx\,dy$
$= \iint_{R^*} (f_x + g_y) \dfrac{d(x, y)}{d(u, v)}\,du\,dv$
$= \iint_{R^*} (B_u - A_v)\,du\,dv.$


Hence, the divergence theorem for R
$\iint_R (f_x + g_y)\,dx\,dy = \int_{+C} (f\,dy - g\,dx)$
will follow from the corresponding formula for R*,
$\iint_{R^*} (B_u - A_v)\,du\,dv = \int_{+C^*} (A\,du + B\,dv).$


For the validity of the theorem for a region R, it is sufficient that R 
can be transformed into a region whose boundary consists of simple 
arcs intersected by parallels to the axes in, at most, one point. If, for 
example, the boundary C of R is a polygon, we can always rotate the
figure in such a way that none of the sides of the polygon is parallel 
to one axis, and the divergence theorem will apply. 


5.2 Vector Form of the Divergence Theorem. Stokes’s 
Theorem 


Gauss’s theorem can be stated in a particularly simple way if we 
make use of the notations of vector analysis. For this purpose we con-
sider the two functions f(x, y) and g(x, y) as the components of a plane 
vector field A. The integrand of the double integral in formula (2) is 
denoted by div A, 


$\operatorname{div} A = f_x(x, y) + g_y(x, y)$


and is called the divergence of the vector A (cf. p. 208). In order to ob- 
tain a vector expression for the line integral on the right side in the 
divergence theorem, we introduce the length of arc s of the oriented
boundary curve +C (cf. Volume I, p. 352). Here, the sense of increasing
s is taken to correspond to the orientation1 of the curve +C. The right
side of identity (2) then becomes


$\int_C [f(x, y)\,\dot{y} - g(x, y)\,\dot{x}]\,ds,$


where we put $dx/ds = \dot{x}$ and $dy/ds = \dot{y}$.

We now recall that the plane vector t with components $\dot{x}$ and $\dot{y}$
has unit length and has the direction of the tangent in the sense of
increasing s and, hence, in the direction given by the orientation of
C. The vector n with components $\xi = \dot{y}$ and $\eta = -\dot{x}$ has length 1, is
perpendicular to the tangent, and, moreover, has the same position
relative to the vector t as the positive x-axis has relative to the
positive y-axis.2 If, as usual, a 90° clockwise rotation takes the positive
y-axis into the positive x-axis, the vector n is obtained by a 90°
clockwise rotation from the tangent vector t. Thus, n is the normal
pointing to the “right” side of the oriented curve C (cf. Volume I,
p. 346). Since in our case +C is oriented in such a way that the region
R lies on the left side of +C, it follows that n is the unit vector in
the direction of the outward-drawn normal (see Fig. 5.4). The components
$\xi$, $\eta$ of the unit vector n are the direction cosines of the
outward normal:


$\xi = \cos\theta, \quad \eta = \sin\theta$


1In effect, this convention on s makes the value of a line integral of the form


$I = \int_C h\,ds$


independent of the orientation of C as long as the integrand h does not depend on the
orientation. If C is represented parametrically in the form x = x(t), y = y(t) for $a \le t \le b$,
where the sense of increasing t corresponds to a particular orientation of C,
then


$I = \int_C h\,ds = \int_a^b h\,\dfrac{ds}{dt}\,dt,$


where ds/dt > 0. In particular, I > 0 whenever the integrand h is positive along the
curve.

2We see this from considerations of continuity; we may suppose that the tangent to
the curve is made to coincide with the y-axis in such a way that t points in the
direction of increasing y. Then $\dot{x} = 0$, $\dot{y} = 1$, so that the vector n with components
$\xi = 1$ and $\eta = 0$ has the direction of the positive x-axis.


Figure 5.4


if n forms the angle $\theta$ with the positive x-axis. It is useful to notice
that the components of n can also be written as directional derivatives
of x and y in the direction of n:


$\xi = \dot{y} = \dfrac{dx}{dn}, \quad \eta = -\dot{x} = \dfrac{dy}{dn},$


since for any scalar h(x, y) the derivative of h in the direction of n
is given by


$\dfrac{dh}{dn} = h_x \cos\theta + h_y \sin\theta = \xi h_x + \eta h_y$
(see p. 44).
Gauss’s theorem therefore can be written in the form


(6) $\iint_R \operatorname{div} A\,dx\,dy = \int_C \left[ f \dfrac{dx}{dn} + g \dfrac{dy}{dn} \right] ds.$


Here the integrand on the right is the scalar product A · n of the
vector A with components f, g and the vector n with components
dx/dn, dy/dn. Since the vector n has length 1, the scalar product
A · n represents the component $A_n$ of the vector A in the direction
of n. Consequently, the divergence theorem takes the form

$$\iint_R \operatorname{div} \mathbf{A} \, dx \, dy = \int_C \mathbf{A} \cdot \mathbf{n} \, ds = \int_C A_n \, ds. \tag{7}$$


In words, the double integral of the divergence of a plane vector field 
over a set R is equal to the line integral, along the boundary C of R, 
of the component of the vector field in the direction of the outward- 
drawn normal. 
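
Formula (7) can be checked numerically on a simple region. The sketch below is only an illustration under assumptions that are not part of the text: the region is taken to be the unit disk, the field is the arbitrarily chosen A = (x³, y³), and both integrals are approximated by the midpoint rule. All function names are hypothetical.

```python
import math

def field(x, y):
    """Sample vector field A = (f, g); an arbitrary choice for illustration."""
    return x ** 3, y ** 3

def divergence(x, y):
    """div A = f_x + g_y for the sample field above."""
    return 3 * x * x + 3 * y * y

def area_integral(n=400):
    """Midpoint rule (polar grid) for the double integral of div A over the unit disk."""
    dr, dth = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            total += divergence(r * math.cos(th), r * math.sin(th)) * r * dr * dth
    return total

def boundary_flux(n=400):
    """Midpoint rule for the line integral of A . n over the unit circle."""
    dth = 2 * math.pi / n
    total = 0.0
    for j in range(n):
        th = (j + 0.5) * dth
        x, y = math.cos(th), math.sin(th)   # point on the circle; also the outward normal
        f, g = field(x, y)
        total += (f * x + g * y) * dth      # ds = dtheta on the unit circle
    return total
```

For this field both sides equal 3π/2, so `area_integral()` and `boundary_flux()` should agree up to the quadrature error.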

In order to arrive at an entirely different vector interpretation of 
Gauss’s theorem in the plane we put 


$$a(x, y) = -g(x, y), \qquad b(x, y) = f(x, y).$$


Then, by (2),

$$\iint_R (b_x - a_y) \, dx \, dy = \int_C (a\dot{x} + b\dot{y}) \, ds = \int_C a \, dx + b \, dy. \tag{8}$$


If the two functions a and b are again taken as components of a
vector field B (where at each point B is obtained from the vector A
by a 90° rotation in the counterclockwise sense), we see that $a\dot{x} + b\dot{y}$
is the scalar product of B with the tangential unit vector t:

$$a\dot{x} + b\dot{y} = \mathbf{B} \cdot \mathbf{t} = B_t,$$

where $B_t$ is the tangential component of the vector B. The integrand
of the double integral in (8) appeared on p. 209 as a component of the
curl of a vector in space. In order to apply the concept of curl here
we imagine the plane vector field B continued somehow into x, y, z-space
in such a way that in the x, y-plane the x- and y-components
of B coincide with a(x, y) and b(x, y), respectively. Then $b_x - a_y$
represents the z-component $(\operatorname{curl} \mathbf{B})_z$ of curl B. The divergence
theorem now takes the form

$$\iint_R (\operatorname{curl} \mathbf{B})_z \, dx \, dy = \int_C B_t \, ds. \tag{9}$$


We can formulate the theorem in words as follows: 


The integral of the z-component of the curl of a vector field in space
taken over a set R in the x, y-plane is equal to the integral of the tangential
component taken around the boundary of R. This statement is Stokes’s
theorem in the plane.

If we make use of the vector character of the curl of a vector field
in space we can free the Stokes theorem from the restriction that the
plane region R lie in the x, y-plane. Any plane in space can be taken
as x, y-plane of a suitable coordinate system. We thus arrive at the
more general formulation of Stokes’s theorem:

$$\iint_R (\operatorname{curl} \mathbf{B})_n \, dS = \int_C B_t \, ds, \tag{10}$$

where R is any plane region in space bounded by the curve C, and
$(\operatorname{curl} \mathbf{B})_n$ is the component of the vector curl B in the direction of
the normal n to the plane containing R. Here C has to be oriented
in such a way that the tangent vector t points in the counterclockwise
direction as seen from that side of the plane toward which n points.

If the complete boundary C of R consists of several closed curves, 
these formulas remain valid provided that we extend the line integral 
over each of those curves, oriented properly so as to leave R on its 
left side. 

Of importance is the special case where the functions a(x, y),
b(x, y) satisfy the integrability condition

$$a_y = b_x, \tag{11}$$

that is, where a dx + b dy is a “closed” form. Here the double
integral over R vanishes and we find from (8) that

$$\int_C a \, dx + b \, dy = 0$$

whenever C denotes the complete boundary of a region R in which
(11) holds. This again implies, as we saw on p. 96, that

$$\int a \, dx + b \, dy$$

extended over a simple arc has the same value for all arcs that have
the same end points and that can be deformed into each other without
leaving R (see p. 104).
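
The path independence that follows from (11) can be illustrated numerically. The following sketch assumes the sample closed form a dx + b dy with a = 2xy, b = x² (so that a_y = b_x = 2x) and compares two hypothetical paths from (0, 0) to (1, 1); neither the form nor the paths come from the text.

```python
def a(x, y):
    """Coefficient of dx in the sample closed form (an illustrative assumption)."""
    return 2 * x * y

def b(x, y):
    """Coefficient of dy; note that a_y = b_x = 2x, so condition (11) holds."""
    return x * x

def straight(t):
    """Straight segment from (0, 0) to (1, 1)."""
    return t, t

def parabola(t):
    """Parabolic arc y = x^2 with the same end points."""
    return t, t * t

def line_integral(path, n=2000):
    """Approximate the integral of a dx + b dy along path(t), 0 <= t <= 1,
    using the midpoint of each parameter subinterval."""
    total = 0.0
    for k in range(n):
        x0, y0 = path(k / n)
        x1, y1 = path((k + 1) / n)
        xm, ym = path((k + 0.5) / n)
        total += a(xm, ym) * (x1 - x0) + b(xm, ym) * (y1 - y0)
    return total
```

Both `line_integral(straight)` and `line_integral(parabola)` should be close to 1, the difference of the values of x²y at the end points.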


Exercises 5.2 


1. Use the divergence theorem in the plane to evaluate the line integral

$$\int_C A \, du + B \, dv$$

for the following functions and paths taken in the counterclockwise
sense about the given region:

(a) $A = au + bv$, $B = 0$; $u \ge 0$, $v \ge 0$, $\alpha^2 u + \beta^2 v \le 1$
(b) $A = u^2 - v^2$, $B = 2uv$; $|u| < 1$, $|v| < 1$
(c) $A = v^n$, $B = u^n$; $u^2 + v^2 \le r^2$.

2. Derive the formula for the divergence theorem in polar coordinates:

$$\int_{+C} f(r, \theta) \, dr + g(r, \theta) \, d\theta = \iint_R \frac{1}{r} \left( \frac{\partial g}{\partial r} - \frac{\partial f}{\partial \theta} \right) dS.$$

3. Assuming the conditions for the divergence theorem hold, derive the
following expressions in polar coordinates for the area of a region R with
boundary C,

$$\frac{1}{2} \int_{+C} r^2 \, d\theta, \qquad -\int_{+C} r\theta \, dr,$$

where in the second formula we assume that R does not contain the
origin.

4. Apply Stokes’s theorem in the x, y-plane to show that

$$\iint_R \frac{d(u, v)}{d(x, y)} \, dx \, dy = \int_C u \, (\operatorname{grad} v) \cdot \mathbf{t} \, ds,$$

where t is the positively oriented unit tangent vector for C.


5.3 Formula for Integration by Parts in Two Dimensions. 
Green’s Theorem 


The divergence theorem

$$\iint_R (f_x + g_y) \, dx \, dy = \int_C \left( f \frac{dx}{dn} + g \frac{dy}{dn} \right) ds \tag{12}$$

[see formula (6)] combined with the rule for differentiating a product
immediately yields a formula for integration by parts that is basic in
the theory of partial differential equations. Let f(x, y) = a(x, y) u(x, y)
and g(x, y) = b(x, y) v(x, y), where the functions a, u, b, v have
continuous first derivatives. Since here

$$f_x + g_y = (a u_x + b v_y) + (a_x u + b_y v),$$

we can write formula (12) in the form

$$\iint_R (a u_x + b v_y) \, dx \, dy = \int_C \left( a u \frac{dx}{dn} + b v \frac{dy}{dn} \right) ds - \iint_R (a_x u + b_y v) \, dx \, dy. \tag{13}$$


To obtain Green’s first theorem we apply this formula to the case
where v = u and where a and b are of the form $a = w_x$ and $b = w_y$.
(We assume that u has continuous first derivatives and w continuous
second derivatives in the closure of R.) We obtain the equation

$$\iint_R (u_x w_x + u_y w_y) \, dx \, dy = \int_C u \left( w_x \frac{dx}{dn} + w_y \frac{dy}{dn} \right) ds - \iint_R u (w_{xx} + w_{yy}) \, dx \, dy.$$

Using the symbol $\Delta$ for the Laplace operator (p. 211), we write
$w_{xx} + w_{yy} = \Delta w$.

Moreover, dx/dn and dy/dn are the direction cosines of the outward
normal of the boundary C of R (see p. 552); thus, we have in

$$w_x \frac{dx}{dn} + w_y \frac{dy}{dn} = \frac{dw}{dn}$$

the directional derivative of w taken in the direction of the outward
normal to C.¹ In this notation Green’s first theorem becomes

$$\iint_R (u_x w_x + u_y w_y) \, dx \, dy = \int_C u \frac{dw}{dn} \, ds - \iint_R u \, \Delta w \, dx \, dy. \tag{14}$$


If in addition u has continuous second derivatives, we obtain from
(14) by interchanging the roles of u and w the formula

$$\iint_R (w_x u_x + w_y u_y) \, dx \, dy = \int_C w \frac{du}{dn} \, ds - \iint_R w \, \Delta u \, dx \, dy.$$

Subtracting the two relations yields an equation symmetric in u
and w and known as Green’s second theorem:

$$\iint_R (u \, \Delta w - w \, \Delta u) \, dx \, dy = \int_C \left( u \frac{dw}{dn} - w \frac{du}{dn} \right) ds. \tag{15}$$

¹ Usually dw/dn is called, for short, the normal derivative of w.

The two theorems of Green are basic in the study of the solutions of
the partial differential equation $u_{xx} + u_{yy} = 0$ (Laplace equation).
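
Green’s first theorem (14) can be tested on a concrete case. The sketch below assumes, purely for illustration, the unit square as region R and the sample functions u = xy, w = x² + y³; for these both sides of (14) equal 1. The helper names are hypothetical, and the integrals are approximated by the midpoint rule.

```python
def u(x, y):
    """Sample function u = x y (an illustrative assumption)."""
    return x * y

def grad_u(x, y):
    return y, x

def grad_w(x, y):
    """Gradient of the sample function w = x^2 + y^3."""
    return 2 * x, 3 * y * y

def lap_w(x, y):
    """Laplacian of w = x^2 + y^3."""
    return 2 + 6 * y

def double_integral(h, n=200):
    """Midpoint rule for the integral of h over the unit square."""
    d = 1.0 / n
    return sum(h((i + 0.5) * d, (j + 0.5) * d)
               for i in range(n) for j in range(n)) * d * d

def boundary_integral(n=200):
    """Integral of u * dw/dn over the four sides of the unit square."""
    d = 1.0 / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * d
        # (point, outward normal) on the right, left, top, and bottom sides
        for (x, y), (nx, ny) in (((1.0, s), (1, 0)), ((0.0, s), (-1, 0)),
                                 ((s, 1.0), (0, 1)), ((s, 0.0), (0, -1))):
            wx, wy = grad_w(x, y)
            total += u(x, y) * (wx * nx + wy * ny) * d
    return total

def green_first_sides():
    """Return (left side, right side) of formula (14) for the sample data."""
    def dot_of_gradients(x, y):
        ux, uy = grad_u(x, y)
        wx, wy = grad_w(x, y)
        return ux * wx + uy * wy
    lhs = double_integral(dot_of_gradients)
    rhs = boundary_integral() - double_integral(lambda x, y: u(x, y) * lap_w(x, y))
    return lhs, rhs
```

Calling `green_first_sides()` returns two numbers that should both be close to 1; the boundary term alone is 5/2 and the Laplacian term is 3/2.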


5.4 The Divergence Theorem Applied to the Transformation 
of Double Integrals 


a. The Case of 1-1 Mappings 


The divergence theorem yields a new proof for the fundamental 
rule for transformation of double integrals to new independent 
variables (see p. 403). The divergence theorem for a region R with 
boundary C can be stated in the form 


$$\iint_R dL = \int_{+C} L \tag{16}$$

[see formula (2a), p. 545].² Here, putting f = b, g = −a,

$$L = a(x, y) \, dx + b(x, y) \, dy \tag{17a}$$

$$dL = (b_x - a_y) \, dx \, dy. \tag{17b}$$

If the curve C has a parametric representation

$$x = x(t), \qquad y = y(t), \qquad \alpha \le t \le \beta,$$

where the sense of increasing t corresponds to the orientation of +C,
we can write the line integral in (16) as the ordinary integral

$$\int_{+C} L = \int_{+C} a \, dx + b \, dy = \int_\alpha^\beta \frac{L}{dt} \, dt \tag{17c}$$

with the integrand

$$\frac{L}{dt} = a \frac{dx}{dt} + b \frac{dy}{dt}$$

(see p. 307).

¹ See the section on potential theory (p. 713).

² Here and in what follows we always assume tacitly that the assumptions used in the
proof of the divergence theorem are satisfied; that is, that R is an open set whose
boundary C consists of a finite number of smooth arcs, each of which is intersected
in at most one point by parallels to the axes. The coefficients of the linear form L
are assumed to have continuous first derivatives in the closure of R.
We now consider a mapping defined by functions

$$u = u(x, y), \qquad v = v(x, y). \tag{18a}$$

We assume that the mapping is 1-1 in the closure of R and that the
Jacobian d(u, v)/d(x, y) is positive throughout. Let R be mapped
onto the set R’ in the u, v-plane and C onto the boundary C’ of R’.
Moreover, C’ also shall consist of a finite number of smooth arcs, each
of which is intersected in, at most, one point by any parallel to a
coordinate axis. Since the Jacobian is positive, the orientation is
preserved; that is, for increasing t the point (u, v) given by

$$u = u(x(t), y(t)), \qquad v = v(x(t), y(t))$$

describes the curve C’ in such a way that we leave the set R’ to our
left. Referred to the coordinates u, v we have

$$L = A \, du + B \, dv = A(u_x \, dx + u_y \, dy) + B(v_x \, dx + v_y \, dy) = a \, dx + b \, dy,$$

where the coefficients A, B in the u, v-system are connected with
the coefficients a, b in the x, y-system by the relations

$$a = A u_x + B v_x, \qquad b = A u_y + B v_y.$$

Along C’

$$\frac{L}{dt} = A \frac{du}{dt} + B \frac{dv}{dt} = a \frac{dx}{dt} + b \frac{dy}{dt},$$

so that by (17c)

$$\int_{+C} L = \int_\alpha^\beta \frac{L}{dt} \, dt = \int_{+C'} A \, du + B \, dv = \int_{+C'} L. \tag{18b}$$

Applying the divergence theorem (16) to the region R’ in the u, v-plane,
we find that

$$\int_{+C'} L = \iint_{R'} dL, \tag{18c}$$

where, in analogy to (17b),

$$dL = (B_u - A_v) \, du \, dv.$$
One verifies immediately that¹

$$\begin{aligned}
b_x - a_y &= (A u_y + B v_y)_x - (A u_x + B v_x)_y \\
&= (A_u u_x + A_v v_x) u_y + (B_u u_x + B_v v_x) v_y - (A_u u_y + A_v v_y) u_x - (B_u u_y + B_v v_y) v_x \\
&= (B_u - A_v)(u_x v_y - u_y v_x).
\end{aligned}$$

Thus, we conclude from (18b, c) and (16) that

$$\begin{aligned}
\iint_{R'} dL &= \iint_{R'} (B_u - A_v) \, du \, dv = \iint_R dL \\
&= \iint_R (b_x - a_y) \, dx \, dy = \iint_R (B_u - A_v) \frac{d(u, v)}{d(x, y)} \, dx \, dy.
\end{aligned} \tag{19}$$

This formula contains the general law of transformation

$$\iint_{R'} f(u, v) \, du \, dv = \iint_R f(u(x, y), v(x, y)) \frac{d(u, v)}{d(x, y)} \, dx \, dy \tag{20}$$

for double integrals [see (16b), p. 403]. We only have to choose the
functions A, B in (19) in such a way that A = 0 and $B_u = f(u, v)$.
This means that for fixed v the function B shall be some indefinite
integral of f(u, v) as a function of u alone:

$$B(u, v) = \int_{g(v)}^{u} f(w, v) \, dw + h(v),$$

where h(v) is arbitrary and g(v) is chosen in such a way that the
point (g(v), v) lies in R’. For the special function f = 1, formula
(20) yields an expression for the area of the image region as a double
integral:


$$\iint_{R'} du \, dv = \iint_R \frac{d(u, v)}{d(x, y)} \, dx \, dy. \tag{20a}$$

¹ This formula follows without any algebraic computations if we use the fact proved
on p. 322 that dL can be formed for a form L without reference to any particular
coordinate system; hence, by (56c), p. 308,

$$b_x - a_y = \frac{dL}{dx \, dy} = \frac{dL}{du \, dv} \, \frac{d(u, v)}{d(x, y)} = (B_u - A_v) \frac{d(u, v)}{d(x, y)}.$$

Essentially formula (20) expresses the fact that the double integral
of a second-order differential form $\omega = f \, du \, dv$ does not change under
changes of the independent variables. This fact is proved here by
expressing $\omega$ as derivative dL of a first-order form L, reducing the
double integral to a line integral by means of the divergence theorem,
and making use of the invariance of a line integral $\int L$.
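
Formula (20a) and the idea of its proof (reducing an area to a line integral over the image boundary) can be illustrated numerically. The sketch assumes a sample mapping that is not in the text, u = x² − y², v = 2xy on the rectangle 1 ≤ x ≤ 2, 0 ≤ y ≤ 1, where it is 1-1 with Jacobian 4(x² + y²) > 0. The image area is computed once from the Jacobian and once from the image of the boundary by the polygon (shoelace) form of ½∮(u dv − v du).

```python
def mapping(x, y):
    """Sample 1-1 mapping with positive Jacobian on the rectangle (an assumption)."""
    return x * x - y * y, 2 * x * y

def jacobian(x, y):
    """d(u, v)/d(x, y) for the sample mapping."""
    return 4 * (x * x + y * y)

def area_by_jacobian(n=400):
    """Right side of (20a): midpoint rule for the Jacobian over [1, 2] x [0, 1]."""
    d = 1.0 / n
    return sum(jacobian(1 + (i + 0.5) * d, (j + 0.5) * d)
               for i in range(n) for j in range(n)) * d * d

def area_by_boundary(n=2000):
    """Left side of (20a): shoelace formula applied to the image of the
    counterclockwise boundary of the rectangle, sampled as a polygon."""
    corners = [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)]
    pts = []
    for k in range(4):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % 4]
        for i in range(n):
            t = i / n
            pts.append(mapping(x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return 0.5 * sum(u0 * v1 - u1 * v0
                     for (u0, v0), (u1, v1) in zip(pts, pts[1:] + pts[:1]))
```

For this mapping both functions should return values close to 32/3. The sign of the shoelace sum is positive because a positive Jacobian preserves the orientation of the boundary.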


b. Transformation of Integrals and Degree of Mapping 


It is interesting to observe what happens to the transformation
formula (20) when the mapping

$$u = u(x, y), \qquad v = v(x, y)$$

is no longer 1-1 and when its Jacobian is not necessarily positive.
First, we look at the case where the mapping of R onto R’ is 1-1, but
the Jacobian is negative throughout the closure of R. The only difference
in the argument leading to (20) is that now +C and +C’ have opposite
orientations: if increasing parameter values t on C’ means leaving
R’ on the left, then increasing t on C means leaving R on the right.
In applying the divergence theorem (16) we assume that the boundary
of the two-dimensional region is oriented in such a way that the region
lies on the positive (left) side of the boundary. The result is that
formula (20)¹ has to be replaced by

$$\iint_{R'} f \, du \, dv = -\iint_R f \, \frac{d(u, v)}{d(x, y)} \, dx \, dy. \tag{20b}$$


We can combine formulae (20) and (20b) into a single formula valid
whenever the mapping from (x, y) onto (u, v) is 1-1 and the Jacobian
is of constant sign:

¹ Formula (20) applies unchanged if the two-dimensional regions R and R’ themselves
are considered as oriented manifolds. In that case, the sign of an integral over the
manifold changes when the orientation of the manifold is reversed. A negative
Jacobian for the mapping implies that R and R’ have opposite orientations, so that
formula (20) persists if written as

$$\iint_{R'} f \, du \, dv = \iint_R f \, \frac{d(u, v)}{d(x, y)} \, dx \, dy.$$

Instead of orienting the regions, we can also replace the Jacobian by its absolute
value as in formula (16b) on p. 403.

$$\iint \varepsilon_R \, f \, du \, dv = \iint_R f \, \frac{d(u, v)}{d(x, y)} \, dx \, dy. \tag{21}$$


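Here the integral on the left side is to be extended over the whole
u, v-plane, and the function $\varepsilon_R = \varepsilon_R(u, v)$ is defined as

$$\varepsilon_R(u, v) = \begin{cases} 0 & \text{if } (u, v) \text{ is not the image of a point of } R \\ \operatorname{sign} \dfrac{d(u, v)}{d(x, y)} & \text{if } (u, v) \text{ is the image of a point of } R. \end{cases}$$

More generally we consider the case where the mapping of R is not
necessarily 1-1. We assume that we can divide R into subsets $R_i$,
each of which is mapped 1-1 and in each of which the Jacobian is
of constant sign $\varepsilon_{R_i}$. Then

$$\iint_R f \, \frac{d(u, v)}{d(x, y)} \, dx \, dy = \sum_i \iint_{R_i} f \, \frac{d(u, v)}{d(x, y)} \, dx \, dy = \sum_i \iint \varepsilon_{R_i} f \, du \, dv = \iint \chi_R \, f \, du \, dv.$$

Here the last integral is extended over the whole u, v-plane, and the
function $\chi_R$ stands for

$$\chi_R(u, v) = \sum_i \varepsilon_{R_i}(u, v).$$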


Each term $\varepsilon_{R_i}(u, v)$, when (u, v) is image of a point of $R_i$, is equal
to the sign of the Jacobian at the point. Hence, the function $\chi_R(u, v)$,
the degree of the mapping of R at the point (u, v), is the excess of the
number of points of R with image (u, v) for which d(u, v)/d(x, y) is
positive over the number of those points for which d(u, v)/d(x, y)
< 0. With this definition of $\chi_R(u, v)$ the transformation formula for
integrals becomes

$$\iint f(u, v) \, \chi_R(u, v) \, du \, dv = \iint_R f(u(x, y), v(x, y)) \, \frac{d(u, v)}{d(x, y)} \, dx \, dy. \tag{22}$$

Taking the constant 1 for f, we obtain the formula

$$\iint_R \frac{d(u, v)}{d(x, y)} \, dx \, dy = \iint \chi_R(u, v) \, du \, dv, \tag{23}$$

which generalizes formula (20a) to mappings with nonvanishing
Jacobian that are not necessarily 1-1.

As an example, consider the mapping

$$u = e^x \cos y, \qquad v = e^x \sin y, \tag{24a}$$

for which

$$\frac{d(u, v)}{d(x, y)} = e^{2x} > 0$$

for all (x, y). Using polar coordinates r, θ in the u, v-plane defined by
u = r cos θ, v = r sin θ, we see that the image of the point (x, y) is the
point with polar coordinates $r = e^x$, $\theta = y$. Now let R be the rectangle

$$0 < x < \log 2, \qquad -\tfrac{3}{2}\pi < y < \tfrac{3}{2}\pi. \tag{24b}$$

The image points lie in the annulus 1 < r < 2 (see Fig. 5.5). The points
of the annulus with u < 0 are covered twice by the image of R
(they can be assigned polar angles between π/2 and 3π/2 or between
−π/2 and −3π/2). The other points of the annulus are covered once.


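Figure 5.5 Degree of the mapping $u = e^x \cos y$, $v = e^x \sin y$
applied to the rectangle $0 < x < \log 2$, $|y| < \tfrac{3}{2}\pi$.

Hence,

$$\chi_R(u, v) = \begin{cases} 0 & \text{for } 0 \le r \le 1 \text{ or } r \ge 2 \\ 2 & \text{for } 1 < r < 2 \text{ and } u < 0 \\ 1 & \text{for } 1 < r < 2 \text{ and } u \ge 0. \end{cases}$$

Here, since each half of the annulus 1 < r < 2 has area 3π/2, we have

$$\iint \chi_R(u, v) \, du \, dv = 2\left(\tfrac{3}{2}\pi\right) + \tfrac{3}{2}\pi = \tfrac{9}{2}\pi.$$

Alternatively, by direct calculation,

$$\iint_R \frac{d(u, v)}{d(x, y)} \, dx \, dy = \int_{-3\pi/2}^{3\pi/2} dy \int_0^{\log 2} e^{2x} \, dx = 3\pi \int_0^{\log 2} e^{2x} \, dx = \tfrac{9}{2}\pi.$$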
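
The two computations above can be reproduced numerically. The following sketch counts the preimages of a point (u, v) in the rectangle (24b) to obtain the degree, integrates it over the u, v-plane in polar coordinates, and compares the result with the integral of the Jacobian. The function names and the grid sizes are illustrative assumptions.

```python
import math

LOG2 = math.log(2.0)

def degree(u, v):
    """chi_R(u, v) for the mapping (24a) on the rectangle (24b): the number of
    preimages, each counted +1 because the Jacobian e^(2x) is positive."""
    r = math.hypot(u, v)
    if r == 0.0 or not (0.0 < math.log(r) < LOG2):
        return 0
    theta = math.atan2(v, u)
    # preimages are (log r, theta + 2 pi k) with |y| < 3 pi / 2
    return sum(1 for k in (-1, 0, 1)
               if abs(theta + 2 * math.pi * k) < 1.5 * math.pi)

def integral_of_degree(n=300):
    """Midpoint rule in polar coordinates for the integral of chi_R over the disk r < 3."""
    dr, dth = 3.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = -math.pi + (j + 0.5) * dth
            total += degree(r * math.cos(th), r * math.sin(th)) * r * dr * dth
    return total

def integral_of_jacobian(n=300):
    """Midpoint rule for the integral of e^(2x) over the rectangle (24b)."""
    dx = LOG2 / n
    return sum(math.exp(2 * (i + 0.5) * dx) for i in range(n)) * dx * 3 * math.pi
```

Both `integral_of_degree()` and `integral_of_jacobian()` should be close to 9π/2, in agreement with formula (23).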


We have the remarkable identity

$$\chi_R(u, v) = \mu_C(u, v) \tag{25a}$$

between the (signed) number of times $\chi_R(u, v)$ that the image R’ of R
covers the point (u, v) and the number of times $\mu_C(u, v)$ that the image
C’ of C winds about the point (u, v). Here the winding number is
determined in accordance with the definition given in Volume I
(p. 431). Assuming that both the x, y- and u, v-coordinate systems are
right-handed, we give to C the positive sense with respect to R, which
corresponds to leaving R on our left. If on any portion γ of C this
sense is that of increasing values of some parameter t, we also orient
the corresponding portion γ’ of C’ according to increasing t. The
number of times C’ winds about a point $(u_0, v_0)$ not on C’ is then the
difference (here denoted by $\mu_C(u_0, v_0)$) between the number of times
C’ crosses the ray $u = u_0$, $v > v_0$ from right to left and the number of
times C’ crosses from left to right, following C’ in the sense assigned
to it.

Clearly, both sides in the equation (25a) are additive by definition;
that is, dividing R into a finite number of subregions $R_i$ with boundary
curves $C_i$ we have

$$\chi_R(u, v) = \sum_i \chi_{R_i}(u, v), \qquad \mu_C(u, v) = \sum_i \mu_{C_i}(u, v).$$

Hence, it is sufficient for the proof of (25a) to prove that

$$\chi_{R_i}(u, v) = \mu_{C_i}(u, v) \tag{25b}$$

for any portion $R_i$ of R that is mapped 1-1 into the u, v-plane and in
which the Jacobian d(u, v)/d(x, y) has a constant sign $\varepsilon_{R_i}$. Let $R_i$ have
the boundary curve $C_i$, and let $R_i'$ be the image of $R_i$, $C_i'$ that of $C_i$.
Obviously, for any (u, v) not on $C_i'$

$$\chi_{R_i}(u, v) = \begin{cases} \varepsilon_{R_i} & \text{for } (u, v) \text{ in } R_i' \\ 0 & \text{for } (u, v) \text{ exterior to } R_i'. \end{cases}$$

Moreover, $C_i'$ is a simple closed curve whose orientation is counterclockwise
for $\varepsilon_{R_i} > 0$, clockwise for $\varepsilon_{R_i} < 0$ (see Section 3.3e, p. 260).
Hence, the number of times $C_i'$ winds about a point (u, v) also is $\varepsilon_{R_i}$
for (u, v) inside $C_i'$ and is 0 for (u, v) outside $C_i'$, which proves (25b).

For the example on p. 563 the identity of $\chi_R(u, v)$ and $\mu_C(u, v)$ is
immediate by inspection (see Fig. 5.5).
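
The inspection can also be carried out by machine. The sketch below implements the crossing rule for the winding number described above (crossings of the ray u = u₀, v > v₀, counted +1 from right to left and −1 from left to right) and applies it to a polygonal sample of the image C’ of the boundary of the rectangle (24b). The sampling density and function names are assumptions made for illustration.

```python
import math

def winding_number(polygon, u0, v0):
    """Signed count of crossings of the ray u = u0, v > v0 by a closed polygon:
    +1 for each crossing from right to left, -1 from left to right.
    The point (u0, v0) is assumed not to lie on the polygon."""
    count = 0
    m = len(polygon)
    for k in range(m):
        (ua, va), (ub, vb) = polygon[k], polygon[(k + 1) % m]
        if (ua > u0) == (ub > u0):
            continue                      # edge does not cross the line u = u0
        t = (u0 - ua) / (ub - ua)
        if va + t * (vb - va) > v0:       # crossing point lies on the ray
            count += 1 if ua > u0 else -1
    return count

def image_boundary(n=400):
    """Image C' of the positively oriented boundary of the rectangle (24b)
    under u = e^x cos y, v = e^x sin y, sampled as a closed polygon."""
    x1, y1 = math.log(2.0), 1.5 * math.pi
    corners = [(0.0, -y1), (x1, -y1), (x1, y1), (0.0, y1)]   # counterclockwise
    pts = []
    for k in range(4):
        (xa, ya), (xb, yb) = corners[k], corners[(k + 1) % 4]
        for i in range(n):
            t = i / n
            x, y = xa + t * (xb - xa), ya + t * (yb - ya)
            pts.append((math.exp(x) * math.cos(y), math.exp(x) * math.sin(y)))
    return pts
```

With `c = image_boundary()`, the calls `winding_number(c, -1.5, 0.0)`, `winding_number(c, 1.5, 0.0)`, and `winding_number(c, 0.5, 0.0)` should return 2, 1, and 0, the same values as the degree of the mapping at those points.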


5.5 Area Differentiation. Transformation of Δu to Polar
Coordinates


On p. 387 we defined the notion of space differentiation of a triple
integral. In two dimensions we deal with the corresponding concept
of area differentiation of a double integral

$$M(R) = \iint_R \rho(x, y) \, dx \, dy. \tag{26}$$

We assume here that ρ(x, y) is a continuous function defined in an
open set S of the x, y-plane. With any (Jordan-measurable and closed)
subset R of S we can then associate through formula (26) a value
M = M(R). We denote by A(R) the area of R:

$$A(R) = \iint_R dx \, dy.$$

From the mean value theorem (p. 384) we know that the quotient

$$\frac{M(R)}{A(R)}$$

lies between the supremum and the infimum of ρ(x, y) in R. It follows
that at a point $(x_0, y_0)$ of S

$$\rho(x_0, y_0) = \lim_{n \to \infty} \frac{M(R_n)}{A(R_n)}, \tag{27}$$

where the $R_n$ are any sequence of subsets of S that have an area
$A(R_n)$, contain the point $(x_0, y_0)$, and have diameters tending to 0 for
n → ∞. The limit is analogous to differentiation in one dimension.
We call ρ the area derivative of M with respect to A.

Physically, we can interpret the differential form ρ(x, y) dx dy
(at least for ρ > 0) as the element of mass of a certain mass-distribution
in the plane, the integral M(R) representing the total mass
contained in the set R. Equation (27) then shows that ρ(x, y) can be
obtained as the limit of the masses of the sets $R_n$ divided by their
areas as the $R_n$ shrink into the point (x, y). Calling $M(R_n)/A(R_n)$
the average density of mass-distribution in the set $R_n$, we define
ρ(x, y) as the density at (x, y), or as the mass per unit area. In a different
physical interpretation not restricted to positive ρ, we can think of
ρ dx dy as element of electric charge, of M(R) as the total charge in R,
and of ρ(x, y) as the charge density or charge per unit area.

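In a mapping

$$\bar{x} = \bar{x}(x, y), \qquad \bar{y} = \bar{y}(x, y)$$

of points (x, y) of the plane onto points $(\bar{x}, \bar{y})$ the area of the image $\bar{R}$
of a set R is given by

$$A(\bar{R}) = \iint_{\bar{R}} d\bar{x} \, d\bar{y} = \iint_R \frac{d(\bar{x}, \bar{y})}{d(x, y)} \, dx \, dy$$

[see formula (20a)]. Here clearly the Jacobian

$$\frac{d(\bar{x}, \bar{y})}{d(x, y)} = \lim_{n \to \infty} \frac{A(\bar{R}_n)}{A(R_n)}$$

is the area derivative of the area of the image region with respect to
the area of the original region.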

Imagine now that the plane is covered by a deformable elastic
material where (x, y) is the position of a particle of the material at a
certain time t and that $(\bar{x}, \bar{y})$ is the position of the same particle at a
later time $\bar{t}$. Let ρ(x, y) denote the density of the material at the
position (x, y) at the time t and $\bar{\rho}(\bar{x}, \bar{y})$ that at the time $\bar{t}$ at $(\bar{x}, \bar{y})$. If
we postulate that the total mass of the particles filling the set R at
time t is the same as that of the same particles at the time $\bar{t}$ when they
fill the set $\bar{R}$, then

$$M(\bar{R}) = \iint_{\bar{R}} \bar{\rho} \, d\bar{x} \, d\bar{y} = M(R) = \iint_R \rho \, dx \, dy.$$

It follows that

$$\bar{\rho} = \lim_{n \to \infty} \frac{M(\bar{R}_n)}{A(\bar{R}_n)} = \lim_{n \to \infty} \frac{M(R_n)}{A(R_n)} \, \frac{A(R_n)}{A(\bar{R}_n)} = \frac{\rho}{d(\bar{x}, \bar{y})/d(x, y)}.$$

Hence, mass-densities in mappings $(x, y) \to (\bar{x}, \bar{y})$ transform according
to the rule

$$\bar{\rho}(\bar{x}, \bar{y}) = \frac{\rho(x, y)}{d(\bar{x}, \bar{y})/d(x, y)}. \tag{28}$$

This equation, written as a relation between differential forms (see
p. 308), just states the law of conservation of elements of mass:

$$\bar{\rho} \, d\bar{x} \, d\bar{y} = \rho \, dx \, dy. \tag{28a}$$


Applying the notion of area differentiation enables us to transform
the expression $\Delta u = u_{xx} + u_{yy}$ to new coordinates, for example,
to polar coordinates (r, θ). For this purpose we use the formula

$$\iint_R \Delta u \, dx \, dy = \int_C \frac{du}{dn} \, ds,$$

which arises from Green’s theorem [see (15), p. 558] if we put w = 1.
If we carry out area differentiation using a sequence of sets $R_n$ with
boundaries $C_n$ shrinking into the point (x, y), we find

$$\Delta u = \lim_{n \to \infty} \frac{1}{A(R_n)} \int_{C_n} \frac{du}{dn} \, ds. \tag{29}$$

In order to transform Δu to other coordinates, we therefore have
only to apply the corresponding transformation to the simple line
integral ∫(du/dn) ds, divide by the area, and perform a passage to the
limit. The advantage over the direct calculation is that we need not
carry out the somewhat complicated calculation of the second derivatives
of u, since only the first derivatives occur in the line integral.

As an important example, we shall work out the transformation of
Δu to polar coordinates (r, θ). For $R_n$ we choose a small mesh of the
polar coordinate net,¹ say that between the circles r and r + h and the
lines θ and θ + k, whose area, as we know, has the value

$$A(R_n) = kh\left(r + \tfrac{1}{2} h\right).$$

¹ Here h and k are supposed to tend to 0 as n → ∞.

The first derivatives transform according to the formulae

$$u_r = \frac{\partial}{\partial r} u(r \cos\theta, r \sin\theta) = \frac{1}{r}(x u_x + y u_y)$$

$$u_\theta = \frac{\partial}{\partial \theta} u(r \cos\theta, r \sin\theta) = -y u_x + x u_y.$$

On a circle r = constant the direction cosines of the normal (pointing
in the direction of increasing r) are x/r, y/r, and hence, $du/dn = u_r$,
while ds = r dθ. On a ray θ = constant the direction cosines of the
normal (pointing in the direction of increasing θ) are −y/r, x/r, and
hence, $du/dn = u_\theta / r$ while ds = dr. Thus, taking the integral of
the derivative of u in the direction of the outward normal along the
boundary $C_n$ of $R_n$, we find

$$\begin{aligned}
\int_{C_n} \frac{du}{dn} \, ds &= \int_\theta^{\theta + k} \left[ (r + h) u_r(r + h, \theta) - r u_r(r, \theta) \right] d\theta + \int_r^{r + h} \frac{1}{r} \left[ u_\theta(r, \theta + k) - u_\theta(r, \theta) \right] dr \\
&= \int_\theta^{\theta + k} d\theta \int_r^{r + h} \left[ r u_r(r, \theta) \right]_r dr + \int_r^{r + h} dr \int_\theta^{\theta + k} \frac{1}{r} u_{\theta\theta}(r, \theta) \, d\theta \\
&= \iint_{R_n} \left[ \frac{1}{r} (r u_r)_r + \frac{1}{r} \left( \frac{1}{r} u_\theta \right)_\theta \right] r \, dr \, d\theta.
\end{aligned}$$

Since here by the formula for area in polar coordinates (p. 000)

$$A(R_n) = \iint_{R_n} r \, dr \, d\theta,$$

we find from (29) that

$$\Delta u = \frac{1}{r} (r u_r)_r + \frac{1}{r} \left( \frac{1}{r} u_\theta \right)_\theta = u_{rr} + \frac{1}{r} u_r + \frac{1}{r^2} u_{\theta\theta}, \tag{30}$$

which is the required transformation formula.
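
The limit process (29) on a polar mesh can be imitated numerically. The sketch below assumes the sample function u = x³y, for which Δu = 6xy, and evaluates the quotient of the boundary integral of du/dn by the area kh(r + h/2) for a small mesh. The function names, the mesh sizes, and the choice of u are illustrative assumptions only.

```python
import math

def grad_u(x, y):
    """Gradient of the sample function u = x^3 y (so that Laplace u = 6 x y)."""
    return 3 * x * x * y, x ** 3

def laplacian_by_area_differentiation(r, theta, h=1e-3, k=1e-3, n=50):
    """Quotient in formula (29) for the polar mesh between the circles r, r + h
    and the rays theta, theta + k, with the line integral of du/dn taken by the
    midpoint rule along each of the four sides."""
    def normal_derivative(rad, th, nx, ny):
        ux, uy = grad_u(rad * math.cos(th), rad * math.sin(th))
        return ux * nx + uy * ny

    total = 0.0
    for i in range(n):
        # circular arcs: ds = radius * dtheta, normals are +/- (cos, sin)
        th = theta + (i + 0.5) * k / n
        c, s = math.cos(th), math.sin(th)
        total += normal_derivative(r + h, th, c, s) * (r + h) * k / n
        total += normal_derivative(r, th, -c, -s) * r * k / n
        # radial sides: ds = dr, outward normals are +/- (-sin, cos)
        rad = r + (i + 0.5) * h / n
        for th_side, sign in ((theta + k, 1.0), (theta, -1.0)):
            c, s = math.cos(th_side), math.sin(th_side)
            total += normal_derivative(rad, th_side, -sign * s, sign * c) * h / n
    area = k * h * (r + 0.5 * h)
    return total / area
```

For example, `laplacian_by_area_differentiation(1.0, 0.5)` should be close to 6 cos(0.5) sin(0.5) ≈ 2.524; the small remaining difference reflects that the quotient is the mean value of Δu over the mesh rather than its value at the corner.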


This formula suggests some important special solutions of the
Laplace differential equation Δu = 0. From (30) solutions of this
equation that depend on r alone (that is, that are of the form u = f(r))
must satisfy the condition

$$\frac{1}{r} \left( r f'(r) \right)' = 0,$$

which leads to r f’(r) = constant = a or to

$$u = f(r) = a \log r + b = a \log \sqrt{x^2 + y^2} + b, \tag{31a}$$

where a and b are constants. Similarly, we find that the general
solution of Laplace’s equation that depends on θ alone has the form

$$u = c\theta + d = c \arctan \frac{y}{x} + d, \tag{31b}$$

with constants c and d.


5.6 Interpretation of the Formulae of Gauss and Stokes by 
Two-Dimensional Flows 


Our integral theorems find their most natural interpretation in
terms of the motion of a liquid moving in the x, y-plane. The motion
shall be described at every moment by its velocity field.¹ The particle
that occupies the location (x, y) at the time t shall have the velocity
vector $\mathbf{v} = (v_1, v_2)$.

If the velocity of the liquid were independent of x, y, t, the liquid
that crosses a line segment I during the time interval from t to t + dt
fills at the time t + dt a parallelogram of area (v · n) s dt, where s is
the length of I and n is the unit normal vector to I pointing to the side
of I to which the liquid crosses (see Fig. 5.6).² If instead we arbitrarily
choose for n any one of the two unit normal vectors to I, then (v · n) s dt
is the area filled by the liquid crossing I in the time interval from
t to t + dt, counted positive if the liquid crosses toward the side to
which n points, and negative otherwise. If ρ is the density of the
liquid, then (v · n) ρ s dt is the mass of the liquid that crosses I toward
the side to which n points.

Figure 5.6 Amount of liquid crossing segment I in time dt for uniform flow of velocity v.

¹ The motion in the x, y-plane may be thought of as part of a motion in x, y, z-space,
in which the velocity of any particle is parallel to the x, y-plane and is independent
of the z-coordinate.

² The parallelogram is formed by the points $(\bar{x}, \bar{y})$ for which the segment with end
points $(\bar{x}, \bar{y})$ and

$$(x, y) = (\bar{x} - v_1 \, dt, \; \bar{y} - v_2 \, dt)$$

has points in common with I.


Let C be a curve in the x, y-plane. Along C we arbitrarily select
one of the two possible unit normal vectors and denote it by n. In
a flow with velocity and density depending on x, y, t the integral

$$\int_C (\mathbf{v} \cdot \mathbf{n}) \rho \, ds \tag{32a}$$

represents the mass of the liquid crossing C in unit time toward that
side of C pointed to by n. This follows immediately by approximating
C by a polygon and the flow by one for which the velocity is constant
across each side of the polygon.

If C is the boundary of a region R and if n is the outward drawn
normal, the integral represents the mass of the liquid leaving R in unit
time.¹ Applying the divergence theorem in the form (7), p. 554, we
can express the flow through C as a double integral:

$$\int_C (\mathbf{v} \cdot \mathbf{n}) \rho \, ds = \int_C (\rho \mathbf{v}) \cdot \mathbf{n} \, ds = \iint_R \operatorname{div} (\rho \mathbf{v}) \, dx \, dy. \tag{32b}$$


We can compare this flow of mass through C out of R with the
change of mass contained in R. The total mass of the liquid contained
in the region R at the time t is²

$$\iint_R \rho \, dx \, dy.$$

¹ This will be a negative quantity if the net flow is into R.
² This generally is a function of t, since ρ = ρ(x, y, t) is permitted to vary with t. The
region R and its boundary C are held fixed in the present consideration.

Thus, in unit time there is a loss of mass contained in R by the amount

$$-\frac{d}{dt} \iint_R \rho(x, y, t) \, dx \, dy = -\iint_R \rho_t(x, y, t) \, dx \, dy.$$

If we assume that mass is preserved, then mass can only be lost to R
by passing through the boundary C. Hence, by (32b), we must have

$$\iint_R \operatorname{div} (\rho \mathbf{v}) \, dx \, dy = -\iint_R \rho_t \, dx \, dy. \tag{32c}$$


This identity holds for arbitrary regions R. Dividing by the area of R
and shrinking R into a point (that is, by area differentiation), we find
in the limit that

$$\rho_t + \operatorname{div} (\rho \mathbf{v}) = 0 \tag{33}$$

(cf. Section 4.6, Exercise 15). This differential equation¹ and the
integral relation (32c) express the law of conservation of mass in the
flow. In terms of the components $v_1$, $v_2$ of the velocity vector we can
write (33) as

$$\frac{\partial \rho}{\partial t} + v_1 \frac{\partial \rho}{\partial x} + v_2 \frac{\partial \rho}{\partial y} + \rho \left( \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} \right) = 0. \tag{33a}$$

An important special case of this equation arises when we deal with
an incompressible homogeneous medium in which ρ has a constant
value independent of location and time. In that case equations (33)
or (33a) reduce to an equation for the velocity vector alone:

$$\operatorname{div} \mathbf{v} = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} = 0. \tag{34}$$

It follows from (32b) that the total amount of an incompressible liquid
crossing a closed curve C in unit time is 0:

$$\int_C \mathbf{v} \cdot \mathbf{n} \, ds = 0. \tag{35}$$
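
Relation (35) can be illustrated with a numerical flux computation. In the sketch below the closed curve is assumed to be a circle, and two sample velocity fields are compared: v = (x, −y), which satisfies (34), and v = (x, y), which has divergence 2. Both fields and all names are assumptions chosen for illustration.

```python
import math

def incompressible(x, y):
    """Sample field with div v = 1 - 1 = 0."""
    return x, -y

def expanding(x, y):
    """Sample field with div v = 2, for contrast."""
    return x, y

def flux(v, cx, cy, radius, n=2000):
    """Midpoint rule for the integral of v . n over the circle with center
    (cx, cy) and the given radius, n being the outward unit normal."""
    dth = 2 * math.pi / n
    total = 0.0
    for j in range(n):
        th = (j + 0.5) * dth
        nx, ny = math.cos(th), math.sin(th)
        v1, v2 = v(cx + radius * nx, cy + radius * ny)
        total += (v1 * nx + v2 * ny) * radius * dth
    return total
```

Here `flux(incompressible, 0.3, -0.2, 1.5)` should vanish up to rounding, while `flux(expanding, 0.3, -0.2, 1.5)` should be close to 2 · π · 1.5², the divergence times the enclosed area, in agreement with (32b) for ρ = 1.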


¹ In mechanics often referred to as the continuity equation.

Stokes’s theorem (9), p. 554, applied to the vector v also has an
interpretation in terms of fluid flow. The integral extended over a
closed oriented curve C

$$\int_C \mathbf{v} \cdot \mathbf{t} \, ds,$$

where t is the unit tangent vector corresponding to the orientation
of C, is called the circulation of the fluid around C. By Stokes’s
theorem the circulation is equal to the double integral

$$\iint_R (\operatorname{curl} \mathbf{v})_z \, dx \, dy$$

over the enclosed region R. Hence, the quantity

$$(\operatorname{curl} \mathbf{v})_z = \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}, \tag{36}$$

which is called the vorticity of the motion, measures the density of
circulation at the point (x, y) in the sense that the area integral of the
vorticity gives the circulation around the boundary.

A flow is called irrotational if the vorticity vanishes everywhere,
that is, if

$$\frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y} = 0. \tag{37}$$

By Stokes’s theorem the circulation around a closed curve C vanishes
if C is the boundary of a region where the motion is irrotational.
Since (37) is the condition for $v_1 \, dx + v_2 \, dy$ to be an exact differential
(see p. 104), there exists for an irrotational flow in every simply
connected region a function φ = φ(x, y, t) such that

$$v_1 = -\varphi_x, \qquad v_2 = -\varphi_y. \tag{38}$$

The scalar φ (which is determined within a constant) is called a
velocity potential. In vector notation (38) can be replaced by the single
equation

$$\mathbf{v} = -\operatorname{grad} \varphi. \tag{38a}$$

The irrotational motion of an incompressible homogeneous liquid
satisfies both equations (37) and (34). Substituting for $v_1$ and $v_2$ in (34)
their expressions from (38), we find that the velocity potential is a
solution of Laplace’s equation:

$$\Delta \varphi = \varphi_{xx} + \varphi_{yy} = 0.$$


As an example, we consider the flow that corresponds to the
solution

$$\varphi = a \log r = a \log \sqrt{x^2 + y^2}$$

of the Laplace equation [cf. (31a), p. 569]. By (38) the velocity vector
v has components

$$v_1 = -\frac{ax}{x^2 + y^2}, \qquad v_2 = -\frac{ay}{x^2 + y^2}$$

and is singular at the origin (see Fig. 5.7a). All velocity vectors point
towards the origin for a > 0, away from the origin for a < 0. In this
example the velocity of the liquid at a given location does not change
with time, although we have different velocities at different points;
we speak of a steady flow. The circulation around any closed curve
C not passing through the origin vanishes, since

$$\int_C \mathbf{v} \cdot \mathbf{t} \, ds = \int_C v_1 \, dx + v_2 \, dy = -\int_C d\varphi = 0.$$

Figure 5.7 (a) Flow with sink. (b) Flow with vortex.


On the other hand, the amount of liquid passing outward through the
closed curve C in unit time is

$$\rho \int_C \mathbf{v} \cdot \mathbf{n} \, ds = \rho \int_C \left( v_1 \frac{dx}{dn} + v_2 \frac{dy}{dn} \right) ds = \rho \int_C v_1 \, dy - v_2 \, dx = -a\rho \int_C \frac{x \, dy - y \, dx}{x^2 + y^2} = -a\rho \int_C d\theta,$$

where θ is the polar angle from the origin. Since (see p. 354)

$$\frac{1}{2\pi} \int_C d\theta$$

is an integer that measures the number of times C winds around the
origin, we see that if the closed curve C is simple, does not pass
through the origin, and is oriented counterclockwise,

$$\rho \int_C \mathbf{v} \cdot \mathbf{n} \, ds = \begin{cases} 0 & \text{if } C \text{ does not enclose the origin} \\ -2\pi a \rho & \text{if } C \text{ encloses the origin.} \end{cases}$$

Thus, the same amount of mass flows in unit time through every
simple closed curve C enclosing the origin. For a > 0 the origin is a
sink, where mass disappears at the rate of 2πaρ units in unit time.
For a < 0 we have a source of mass at the origin.

The opposite behavior is encountered if we consider the steady
flow with velocity potential [see (31b), p. 569]

$$\varphi = c\theta = c \arctan \frac{y}{x}.$$
x 


While φ itself is a multiple valued function, the corresponding
velocity field has univalued components

$$v_1 = \frac{cy}{x^2 + y^2}, \qquad v_2 = -\frac{cx}{x^2 + y^2}.$$

The vector v is perpendicular to the radii from the origin (Fig. 5.7b).
Again the velocity field is singular at the origin.
The circulation around a closed curve C has the value

$$\int_C v_1 \, dx + v_2 \, dy = -\int_C d\varphi = -c \int_C d\theta.$$

Hence, the circulation is zero for a simple closed curve not enclosing
the origin. For a simple closed curve running around the origin in the
counterclockwise sense we find the value −2πc for the circulation.
This corresponds to a vortex of strength −2πc concentrated at the
origin. On the other hand, the flow of mass in unit time through any
closed curve C not passing through the origin is 0, since here

$$\rho \int_C \mathbf{v} \cdot \mathbf{n} \, ds = \rho \int_C v_1 \, dy - v_2 \, dx = c\rho \int_C \frac{x \, dx + y \, dy}{x^2 + y^2} = c\rho \int_C \frac{dr}{r} = 0.$$

Thus, the origin is not a source or sink of mass.
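
The contrast between the sink and the vortex can be checked numerically. The sketch below assumes unit density ρ = 1 and circular test curves, and it evaluates the flux ∫ v · n ds and the circulation ∫ v · t ds for both flows; the function names and the particular circles are illustrative assumptions.

```python
import math

def sink(a):
    """Velocity field v = -grad(a log r) = (-a x / r^2, -a y / r^2)."""
    return lambda x, y: (-a * x / (x * x + y * y), -a * y / (x * x + y * y))

def vortex(c):
    """Velocity field v = -grad(c theta) = (c y / r^2, -c x / r^2)."""
    return lambda x, y: (c * y / (x * x + y * y), -c * x / (x * x + y * y))

def flux_and_circulation(v, cx, cy, radius, n=4000):
    """Integrals of v . n and v . t over a counterclockwise circle that does
    not pass through the origin (density taken as 1)."""
    flux = circ = 0.0
    dth = 2 * math.pi / n
    for j in range(n):
        th = (j + 0.5) * dth
        nx, ny = math.cos(th), math.sin(th)     # outward unit normal
        tx, ty = -ny, nx                        # counterclockwise unit tangent
        v1, v2 = v(cx + radius * nx, cy + radius * ny)
        flux += (v1 * nx + v2 * ny) * radius * dth
        circ += (v1 * tx + v2 * ty) * radius * dth
    return flux, circ
```

For a circle enclosing the origin, `flux_and_circulation(sink(1.0), 0.3, 0.2, 1.0)` should give approximately (−2π, 0) and `flux_and_circulation(vortex(1.0), 0.3, 0.2, 1.0)` approximately (0, −2π); for a circle such as center (3, 0), radius 1, which does not enclose the origin, both quantities should vanish for both flows.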


5.7 Orientation of Surfaces 


The theory of integration for three independent variables includes 
not only triple integrals and line integrals, which we have discussed 
previously, but also the concept of surface integral. In order to explain 
the latter, we begin with considerations of a general nature, which 
at the same time will serve to refine our previous ideas relating to 
double integrals. In treating integrals of a differential over a curve 
C in the plane or in space (p. 89), we found it necessary not just 
to consider C as a set of points in space but to assign to it a certain 
sense, or orientation. The same holds when we consider integrals of 
differential forms over surfaces in space of three or more dimensions. 
Similarly, the definition of integrals of third-order differential forms 
over three-dimensional manifolds requires a definition of orientation 
for such manifolds. In discussing this topological concept of orienta- 
tion we shall restrict ourselves to the simplest situations of curves, 
surfaces, and such lying in a euclidean space of any dimension and 
possessing smooth parametric representations in a sufficiently small 
neighborhood of any point. 


a. Orientation of Two-Dimensional Surfaces in Three Space 


In Section 3.4, we described surfaces in three-dimensional space 
by means of their parametric representations. In what follows we use 
a somewhat refined notion of a surface, as a set of points in space 
that exists independently of any particular parametric representation 
and that for its complete description may even require several systems 
of parameters. We define a two-dimensional surface S as a set of points
in x, y, z-space with regular local representations by means of two
parameters.¹ That is, in a neighborhood of any point $P_0$ of S the position
vectors $X = \overrightarrow{OP} = (x, y, z)$ of the points P of S are representable in
the form


(39a) X = X(u, v) 


where the parameters u, v range over an open set $\gamma$ in the u, v-plane
and different (u, v) correspond to different points on S. We require,
moreover, the representation (39a) to be regular in the sense that the
vector X(u, v) has derivatives $X_u = (x_u, y_u, z_u)$ and $X_v = (x_v, y_v, z_v)$
with respect to u, v in $\gamma$ that are continuous and linearly independent.
Independence of the vectors $X_u$, $X_v$ is expressed algebraically by the
condition [see formula (40d) p. 279]


(39b)    $X_u \times X_v \neq 0$


or by 


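(39c)    $\Gamma(X_u, X_v) = \begin{vmatrix} X_u\cdot X_u & X_u\cdot X_v \\ X_v\cdot X_u & X_v\cdot X_v \end{vmatrix} = |X_u \times X_v|^2 > 0,$


where $\Gamma$ denotes the Gram determinant of the vectors $X_u$, $X_v$ [see
p. 191 and formula (45a), p. 284].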

The vectors $X_u(u, v)$ and $X_v(u, v)$ at a point P = X(u, v) of S with
parameters u, v are tangential to S at P and “span” the tangent plane
$\pi(P)$ of S at P; that is, every point of the tangent plane has a
position vector of the form


$$X(u, v) + \lambda X_u(u, v) + \mu X_v(u, v)$$


with suitable constants $\lambda$, $\mu$ (see p. 144). We orient the surface S by
assigning an orientation to each of the tangent planes of S in a
continuous manner. We shall give a precise meaning to this statement.


¹Even for as simple a surface as a sphere we cannot hope to find a single regular
parametric representation for the whole surface. For that reason we only require
existence of local representations for S. Incidentally, we exclude surfaces that have
edges and corners, where no regular local representation is possible (for example,
cubes).

More generally, a (simple) m-dimensional surface in n-dimensional $x_1, \ldots, x_n$-space
is defined as a set of points with local parametric representations of the form
$$X = X(u_1, \ldots, u_m),$$
where the first derivatives of the vector X with respect to the variables $u_k$ are
continuous and linearly independent.

An oriented tangent plane $\pi^*(P)$ is obtained from the plane $\pi(P)$ by
specifying an ordered pair of independent vectors $\xi(P)$ and $\eta(P)$ in
$\pi(P)$. The orientation of $\pi^*$ is then that of the ordered pair $\xi$, $\eta$ or,
symbolically,¹


(40a)    $\Omega(\pi^*(P)) = \Omega(\xi(P), \eta(P)).$


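Any other ordered pair of independent tangential vectors $\xi'$, $\eta'$ at P
determines the same orientation if


(40b)    $[\xi, \eta; \xi', \eta'] = \begin{vmatrix} \xi\cdot\xi' & \xi\cdot\eta' \\ \eta\cdot\xi' & \eta\cdot\eta' \end{vmatrix} > 0$


(see p. 196). More generally,

(40c)    $\Omega(\xi, \eta) = \operatorname{sgn}\,[\xi, \eta; \xi', \eta']\;\Omega(\xi', \eta').$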


The orientation $\Omega(\pi^*)$ can be described more easily in terms of the
unit vector (see Fig. 5.8)


(40d)    $\zeta = \dfrac{\xi \times \eta}{|\xi \times \eta|},$


which is normal to $\xi$ and $\eta$ and, hence, to the tangent plane $\pi(P)$.


Figure 5.8


¹We can picture $\Omega(\pi^*(P))$ as a sense of rotation in the plane $\pi(P)$; namely, as the sense
of that rotation by an angle less than 180° that takes the direction of the vector $\xi$
into that of $\eta$.

The vector $\zeta$ does not depend on the individual pair of tangential
vectors $\xi$, $\eta$ but only on the orientation determined by the vectors.
This follows from the general identity for vector products¹


(40e)    $(\xi \times \eta)\cdot(\xi' \times \eta') = \begin{vmatrix} \xi\cdot\xi' & \xi\cdot\eta' \\ \eta\cdot\xi' & \eta\cdot\eta' \end{vmatrix} = [\xi, \eta; \xi', \eta'].$


If here the ordered pairs of tangential vectors $\xi$, $\eta$ and $\xi'$, $\eta'$ give the
same orientation to $\pi$, then by (40b) the corresponding unit normals
$\zeta$ and $\zeta'$ satisfy


(40f)    $\zeta\cdot\zeta' = \dfrac{[\xi, \eta; \xi', \eta']}{|\xi \times \eta|\,|\xi' \times \eta'|} > 0.$


Since $\zeta$ and $-\zeta$ are the only possible unit normal vectors, it follows
from (40f) that $\zeta' = \zeta$.

We now say that the orientations $\Omega(\pi^*(P))$ determined by (40a)
from pairs of tangential vectors $\xi(P)$, $\eta(P)$ vary continuously with P
if the unit normal vector $\zeta$ given by (40d) depends continuously on
P. An oriented surface S* is defined as a surface S with continuously
oriented tangent planes $\pi^*(P)$. If the orientation of $\pi^*$ is given by
(40a), we write symbolically


(40g)    $\Omega(S^*) = \Omega(\pi^*) = \Omega(\xi, \eta).$


Any unit normal vector $\zeta$ at a point P of S determines an orientation
of the tangent plane $\pi(P)$, namely, the one given by $\Omega(\xi, \eta)$,
where $\xi$, $\eta$ are any tangential vectors for which $\xi \times \eta$ has the
direction of $\zeta$. By formula (71c), p. 181,


(40h)    $\det(\zeta, \xi, \eta) = \zeta\cdot(\xi \times \eta) = |\xi \times \eta| > 0.$


Hence (see p. 186), $\zeta$ is that unit normal vector of S at P for which the
triple of vectors $\zeta$, $\xi$, $\eta$ is oriented positively with respect to the coordinate
axes; that is,¹


(40i)    $\Omega(\zeta, \xi, \eta) = \Omega(x, y, z).$


¹The identity (40e) can be verified directly by writing it in terms of the components of the
vectors involved; see also Exercise 9b, Section 2.4, p. 203. Formula (39c) is the special
case $\xi = \xi' = X_u$, $\eta = \eta' = X_v$.


An orientation of S consists then in choosing in a continuous fashion
a unit normal vector $\zeta$ at all points of S. Here $\zeta$ is given by (40d)
whenever $\Omega(S^*) = \Omega(\xi, \eta)$ for the oriented surface S*. We say that
$\zeta$ is the unit normal vector pointing to the positive side of the oriented
surface S* or is the positive unit normal of S*.²

Let S be a connected surface, that is, one with the property that any
two points of S can be joined by a curve lying on S. It is then easy to
see that either S cannot be oriented at all or that there are exactly
two different ways of orienting S.³ For two orientations of S correspond
to two choices $\zeta(P)$ and $\zeta'(P)$ of unit normal vectors on S. Here,
necessarily, $\zeta' = \varepsilon\zeta$, where $\varepsilon = \varepsilon(P)$ has one of the values +1 or −1.
Since, by assumption, the vectors $\zeta$ and $\zeta'$ vary continuously with P,
the same holds for the scalar $\varepsilon(P) = \zeta\cdot\zeta'$. Thus, $\varepsilon$ is a continuous
function on S assuming only the values +1 or −1. If $\varepsilon(P) \neq \varepsilon(Q)$
for any two points P, Q on S, it would follow from the intermediate
value theorem that $\varepsilon = 0$ somewhere along a curve on S joining P
and Q, contrary to the definition of $\varepsilon$. Consequently, $\varepsilon$ has the same
value at all points of S. Thus, any orientation of S is either the one
described by the normal $\zeta(P)$ or the one described by $-\zeta(P)$. If S* is the
oriented surface with positive normal $\zeta$, we write $-S^*$ for the one with
the other orientation of S, so that


(40j)    $\Omega(-S^*) = -\Omega(S^*).$


Obviously, the orientation of the positive normal $\zeta$ to a connected
surface S at a single point P uniquely determines the positive normal
at any other point Q and, hence, determines the orientation of S. We


¹Formula (40i) shows that the sense of rotation of the plane $\pi$ associated with $\Omega(\xi, \eta)$
appears counterclockwise when viewed from that side of $\pi$ to which $\zeta$ points,
provided the x, y, z-coordinate system is right-handed. Notice that the connection
between $\Omega(\xi, \eta)$ and the direction of $\zeta$ depends on the orientation of the coordinate
system used, since the vector product $\xi \times \eta$ depends on that orientation.

²More generally, any nontangential vector $\zeta$ with initial point P is said to point to
the positive side of S* if (40i) holds. For a “material” oriented surface, say a thin
metal sheet, the two sides of the surface can be painted in distinctive colors. The
pigment layer on the positive side would then only occupy points that can be
reached by starting at a point P of the surface and moving a short distance in the
direction of the positive normal to the surface.

³The assumption that S is connected is essential. For a surface consisting of several
disjoint connected components, the individual components might be oriented
independently of each other. That there exist surfaces that cannot be oriented at all will
be shown on p. 583.

only need to connect Q to P by a curve C on S and define a unit normal
to S along C that coincides with $\zeta$ at P and varies continuously along
C; the normal then also coincides at Q with the positive normal.

It is particularly simple to orient a surface S that forms the boundary
of a three-dimensional region R of space (here S need not be connected,
as in the case of a spherical shell R). At each point P of S we
can distinguish an interior normal pointing into R and an exterior
normal pointing away from R, both varying continuously with P.
Taking the exterior normal as positive normal defines an orientation
for S. We call the corresponding oriented surface S* oriented positively
with respect to R.¹

If, for example, R is the spherical shell


(40k)    $a \le |X| \le b,$

the positively oriented boundary S* of R has the positive unit normal

(40l)    $\zeta = -X/a$ for $|X| = a$ and $\zeta = X/b$ for $|X| = b$.


Let a portion of the oriented surface S* have a regular parametric
representation X = X(u, v) for (u, v) varying over an open set $\gamma$ of the
u, v-plane. Then,


(40m)    $Z = \dfrac{X_u \times X_v}{|X_u \times X_v|}$

defines a unit normal vector for (u, v) in $\gamma$. If $\zeta$ is the positive unit
normal of S*, we have


(40n)    $\zeta = \varepsilon Z$


¹As defined here, the positive orientation of the boundary S of a region R depends
on the orientation of the x, y, z-coordinate system or on the orientation of three-space
determined by that system. It is often more convenient to think of R also as oriented
and to define unambiguously the oriented boundary S* of the oriented connected
region R* in three-space. Here the “orientation” of R* consists of a particular choice
of x, y, z-coordinate system, which then is “oriented positively with respect to R” by
definition:
$$\Omega(R^*) = \Omega(x, y, z).$$

The positively oriented boundary surface S* of R* (usually denoted by $\partial R^*$) is defined
such that

$$\Omega(\zeta, \xi, \eta) = \Omega(R^*)$$
whenever $\xi$, $\eta$ are tangential vectors at a point P of S with $\Omega(S^*) = \Omega(\xi, \eta)$, and $\zeta$
is the exterior normal unit vector at P.

with $\varepsilon = \varepsilon(u, v) = \pm 1$. Since both $\zeta$ and Z are continuous, it follows
that $\varepsilon$ is continuous and, hence, constant in any connected part of
$\gamma$. For $\varepsilon = 1$, that is, for


(40o)    $\Omega(S^*) = \Omega(X_u, X_v),$


we say that S* is oriented positively with respect to the parameters
u, v and write


(40p)    $\Omega(S^*) = \Omega(u, v).$


If the same portion of S* has a second regular parametric representation
in terms of parameters u', v' varying over a region $\gamma'$, we have by
formula (42), p. 283,


(40q)    $X_u \times X_v = \left(\dfrac{d(y, z)}{d(u, v)},\ \dfrac{d(z, x)}{d(u, v)},\ \dfrac{d(x, y)}{d(u, v)}\right) = \dfrac{d(u', v')}{d(u, v)}\,(X_{u'} \times X_{v'}).$


Hence, the unit normals Z and Z' corresponding to the two parametric
representations are related by


(40r)    $Z = \operatorname{sgn}\left(\dfrac{d(u', v')}{d(u, v)}\right) Z'.$


Thus, if S* is oriented positively with respect to the parameters u, v,
then it is also positively oriented with respect to the parameters u', v',
provided


(40s)    $\dfrac{d(u', v')}{d(u, v)} > 0.$


In illustration, we consider the unit sphere S* with center at the
origin, oriented positively with respect to its interior. Using u = x,
v = y as parameters for $z \neq 0$, we have

(40t)    $X = (u, v, \varepsilon\sqrt{1 - u^2 - v^2})$, where $\varepsilon = \operatorname{sgn} z$.

The corresponding normal vector Z defined by (40m) becomes here


$$Z = (\varepsilon x, \varepsilon y, \varepsilon z) = \varepsilon\zeta,$$


where $\zeta$ is the exterior unit normal. Hence, S* is oriented positively
with respect to the parameters x, y for z > 0 and negatively for z < 0
(see Fig. 5.9).
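
The relation between Z and the exterior normal on the two hemispheres can be checked numerically. The following is a minimal sketch under the assumptions of the example above; the helper names `cross` and `chart_normal` are hypothetical, and the derivatives $X_u$, $X_v$ of (40t) are entered by hand.

```python
import math

def cross(p, q):
    """Vector product of two 3-vectors."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def chart_normal(u, v, eps):
    """Unit normal Z of (40m) for the chart X = (u, v, eps*sqrt(1-u^2-v^2))."""
    s = math.sqrt(1.0 - u * u - v * v)
    X_u = (1.0, 0.0, -eps * u / s)
    X_v = (0.0, 1.0, -eps * v / s)
    n = cross(X_u, X_v)
    length = math.sqrt(sum(comp * comp for comp in n))
    return tuple(comp / length for comp in n)

for eps in (+1, -1):
    u, v = 0.3, 0.4
    # On the unit sphere the position vector is also the exterior normal.
    exterior = (u, v, eps * math.sqrt(1.0 - u * u - v * v))
    Z = chart_normal(u, v, eps)
    agreement = sum(zc * ec for zc, ec in zip(Z, exterior))
    print("eps =", eps, " Z . exterior normal =", round(agreement, 12))
```

The scalar product of Z with the exterior normal has the sign of $\varepsilon$, in agreement with $Z = \varepsilon\zeta$.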




Figure 5.9 


A surface in three-space for which no distinction between the sides 
can be made or along which we cannot select a continuously varying 
unit normal cannot be orientable. The simplest example of a “one- 
sided” surface of this type, shown in Fig. 5.10(a) is called a Möbius 


Figure 5.10(a) Mobius band. 


band after its discoverer. We can easily make such a surface out of a
rectangular strip of paper by fastening the ends of the strip together
after rotating one end through an angle of 180°. If we start out with
the rectangle $0 \le u \le 2\pi$, $-a < v < a$ (where 0 < a < 1) in the
u, v-plane, we arrive at a Möbius band if we move each segment u =
constant rigidly in such a way that its center moves to the point
(cos u, sin u, 0) of the unit circle in the x, y-plane and such that it
becomes perpendicular to that circle and makes the angle u/2 with the
positive z-axis (the assumption a < 1 keeps the surface from intersecting
itself). The resulting band S has the parametric representation

(40u)    $X = \left(\left(1 + v\sin\dfrac{u}{2}\right)\cos u,\ \left(1 + v\sin\dfrac{u}{2}\right)\sin u,\ v\cos\dfrac{u}{2}\right)$

with v restricted to the interval $-a < v < a$. The points (u, v),
$(u + 4\pi, v)$, $(u + 2\pi, -v)$ in the u, v-plane correspond to the same
point on the surface. If for an arbitrary point $P_0$ of S we make
one possible choice $u_0$, $v_0$ of parameters, formula (40u) yields a
regular local parametric representation of S for u, v restricted to
the rectangle $\gamma$ given by


$$u_0 - \pi < u < u_0 + \pi, \qquad -a < v < a.$$


Along the center line v = 0 of the surface, equation (40m) defines a
unit normal vector

$$Z = \left(\cos u\cos\dfrac{u}{2},\ \sin u\cos\dfrac{u}{2},\ -\sin\dfrac{u}{2}\right)$$

that varies continuously with u. Starting out with the unit normal
Z = (1, 0, 0) at the point (1, 0, 0) of S corresponding to u = 0 and
letting u increase from 0 to $2\pi$, we describe a complete circuit along
the center line of the surface returning to the same point but with the
opposite unit normal Z = (−1, 0, 0). We would find similarly that, carrying
during our motion a small oriented tangential curve, we return to
the same point with the orientation reversed. Thus, it is not possible
to choose a continuously varying unit normal, or a side of S, or to
choose a sense of rotation on S in a consistent way. The one-sidedness
of the Möbius band is strikingly illustrated by the insects crawling
along the band in the drawing by M. C. Escher, reproduced in Fig.
5.10(b). We see that a surface does not automatically enjoy the
property of orientability.
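
The reversal of the normal after one circuit can be observed numerically. The following is a minimal sketch based on the representation (40u); the function names `mobius` and `unit_normal` and the step size `h` are assumptions made for illustration, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

def mobius(u, v):
    """Parametric representation (40u) of the Möbius band."""
    w = 1.0 + v * math.sin(u / 2)
    return (w * math.cos(u), w * math.sin(u), v * math.cos(u / 2))

def unit_normal(u, v, h=1e-6):
    """Z = X_u x X_v / |X_u x X_v| as in (40m), via central differences."""
    X_u = [(p - m) / (2 * h) for p, m in zip(mobius(u + h, v), mobius(u - h, v))]
    X_v = [(p - m) / (2 * h) for p, m in zip(mobius(u, v + h), mobius(u, v - h))]
    n = (X_u[1] * X_v[2] - X_u[2] * X_v[1],
         X_u[2] * X_v[0] - X_u[0] * X_v[2],
         X_u[0] * X_v[1] - X_u[1] * X_v[0])
    length = math.sqrt(sum(comp * comp for comp in n))
    return tuple(comp / length for comp in n)

# Follow the center line v = 0 once around the band.
for k in range(5):
    u = k * math.pi / 2
    print(round(u, 3), [round(comp, 6) for comp in unit_normal(u, 0.0)])
```

The parameter values u = 0 and u = 2π give the same point (1, 0, 0) of the band, while the computed normals there are opposite.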

We oriented a surface by orienting its tangent planes in a continuous
manner. The orientation of the tangent planes $\pi^*(P)$ was
described by a suitable pair of independent tangential vectors $\xi(P)$,
$\eta(P)$. When it came to defining “continuity” of $\Omega(\pi^*) = \Omega(\xi, \eta)$, we
made use of the normal vector $\zeta$ formed according to (40d) and
required $\zeta$ to be continuous. It is desirable to define continuity of the
orientations $\Omega(\xi(P), \eta(P))$ without recourse to normal vectors or
cross products. This is of particular importance when it comes to
defining orientation for manifolds in higher-dimensional spaces, say,
for a two-dimensional surface S in four-dimensional euclidean space.
Here again, orientation of each tangent plane can be described by an
ordered pair of independent tangential vectors $\xi$, $\eta$. But there is no
unique unit normal vector or “side” of S we can associate with S.
We also cannot require the tangential vectors $\xi(P)$, $\eta(P)$ describing
$\Omega(\pi^*)$ to be defined and continuous for all P on S.¹ We discuss shortly
two definitions of orientation of surfaces in three-space equivalent
to the one given before, but not involving normals and, hence, capable
of generalization to higher dimensions.


Figure 5.10(b) Band Van Mobius II, by M. C. Escher (Escher Foundation,
Haags Gemeentemuseum, The Hague, Netherlands).

Any regular parametric representation X = X(u, v) of a portion 
of a surface of S in three-space determines a continuously varying 
unit normal Z on that portion by means of formula (40m). Let there 
be given a number of regular parametric representations for different 
portions of S. They will then define a continuously varying unit 
normal on all of S and, hence, an orientation of S, provided at least 
one of the representations is valid near any point P of S and provided 
any two representations valid at P lead to the same unit normal vector 
Z. By (40r) the latter condition simply requires that 

(41a)    $\dfrac{d(u', v')}{d(u, v)} > 0$
wherever two of the representations with parameters u, v and u’, v’ 
hold. The surface is then oriented positively with respect to each of 
the given parametric representations. 

For instance, various portions of the unit sphere S have the regular
parametric representations


(41b)    $X = (\sin u\cos v, \sin u\sin v, \cos u)$  for $0 < u < \pi$, $v_0 - \dfrac{\pi}{2} < v < v_0 + \dfrac{\pi}{2}$
(41c)    $X = (u', v', \sqrt{1 - u'^2 - v'^2})$  for $u'^2 + v'^2 < 1$
(41d)    $X = (v'', u'', -\sqrt{1 - u''^2 - v''^2})$  for $u''^2 + v''^2 < 1$.


It is easily seen that all of these representations define an orientation
of S. For example, both (41b) and (41d) apply on the hemisphere z < 0,
and there

$$\frac{d(u'', v'')}{d(u, v)} = \frac{d(\sin u\sin v, \sin u\cos v)}{d(u, v)} = -\sin u\cos u > 0.$$

The unit normal Z obtained from all these parametric representations
is the exterior normal, and the orientation of S is the one that is
positive with respect to the interior.
¹Even for as simple a surface as a sphere in three-space no nonvanishing tangential
vectors $\xi(P)$ can be found that are continuous at all points of the surface. We can,
however, always choose the vectors $\xi(P)$, $\eta(P)$ in such a way that they vary
continuously in a neighborhood of a given point.

The second method to be mentioned expresses the condition of
continuity of $\Omega(\xi(P), \eta(P))$ directly in terms of the vectors $\xi$, $\eta$.
Let $\zeta(P)$ be the unit normal vector associated with $\xi$, $\eta$ by (40d). In
a neighborhood of a given point $P_0$ of S, a regular parametric
representation X = X(u, v) holds, defining a continuously varying
normal vector Z by (40m). Then $\zeta(P) = \varepsilon(P) Z(P)$ with a certain
$\varepsilon(P) = \pm 1$. Continuity of the vector $\zeta(P)$ at $P_0$ obviously is equivalent
to the condition $\varepsilon(P)$ = constant near $P_0$ or to the condition


$$\zeta(P)\cdot\zeta(P_0) = \varepsilon(P)\,\varepsilon(P_0)\,Z(P)\cdot Z(P_0) > 0$$


for all P sufficiently close to $P_0$. Now, using the identity (40e), we
find that


$$\zeta(P)\cdot\zeta(P_0) = \frac{[\xi(P), \eta(P); \xi(P_0), \eta(P_0)]}{|\xi(P)\times\eta(P)|\,|\xi(P_0)\times\eta(P_0)|}.$$


Consequently, the orientations $\Omega(\xi, \eta)$ vary continuously and define
an orientation of the surface S if for every $P_0$ on S


(41e)¹    $[\xi(P), \eta(P); \xi(P_0), \eta(P_0)] > 0$


for all points P on S sufficiently close to $P_0$.
For example, let S be the unit sphere $x^2 + y^2 + z^2 = 1$. For any
point (x, y, z) on S that is not one of the poles $(0, 0, \pm 1)$, the vectors


$$\xi = (xz, yz, z^2 - 1), \qquad \eta = (-y, x, 0)$$


are independent and tangential, since they are perpendicular to the
position vector X = (x, y, z). With the additional choice of


$$\xi = (1, 0, 0), \qquad \eta = (0, \varepsilon, 0)$$


at the pole $(0, 0, \varepsilon)$, where $\varepsilon = \pm 1$, the orientations $\Omega(\xi, \eta)$ are
continuous at every point $P_0$ of S. This is clear when $P_0$ is not one of the
poles, since then $\xi$ and $\eta$ themselves are continuous and not zero.
Thus, one only has to verify condition (41e) when $P_0$ is a pole.
For example, for the “north pole” $P_0 = (0, 0, 1)$ and for any point
P = (x, y, z) in the “northern hemisphere”


¹One can deduce directly from formula (85c), p. 199, that (41e) is a relation between
$\Omega(\pi^*(P))$ and $\Omega(\pi^*(P_0))$ alone and does not depend on the particular vectors $\xi(P)$,
$\eta(P)$, $\xi(P_0)$, $\eta(P_0)$ used to represent the orientations of those tangent planes.

$$[\xi(P), \eta(P); \xi(P_0), \eta(P_0)] = \begin{vmatrix} \xi(P)\cdot\xi(P_0) & \xi(P)\cdot\eta(P_0) \\ \eta(P)\cdot\xi(P_0) & \eta(P)\cdot\eta(P_0) \end{vmatrix} = \begin{vmatrix} xz & yz \\ -y & x \end{vmatrix} = (x^2 + y^2)\,z > 0$$


except for $P = P_0$. But, of course, also


$$[\xi(P_0), \eta(P_0); \xi(P_0), \eta(P_0)] = \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1 > 0.$$
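
Condition (41e) for the north pole can also be checked numerically. The following is a minimal sketch using the tangential vectors chosen above; the helper names `frame` and `bracket` are hypothetical, and the sample point is an arbitrary point of the northern hemisphere.

```python
import math

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def frame(P):
    """Tangential vectors (xi, eta) at P on the unit sphere, chosen as in the text."""
    x, y, z = P
    if x == 0.0 and y == 0.0:            # pole (0, 0, eps) with eps = z = +1 or -1
        return (1.0, 0.0, 0.0), (0.0, z, 0.0)
    return (x * z, y * z, z * z - 1.0), (-y, x, 0.0)

def bracket(xi, eta, xi0, eta0):
    """[xi, eta; xi0, eta0]: determinant of the matrix of scalar products."""
    return dot(xi, xi0) * dot(eta, eta0) - dot(xi, eta0) * dot(eta, xi0)

P0 = (0.0, 0.0, 1.0)                      # north pole
xi0, eta0 = frame(P0)
P = (0.6 * math.cos(1.0), 0.6 * math.sin(1.0), 0.8)   # a point with z > 0
xi, eta = frame(P)
x, y, z = P
print(bracket(xi, eta, xi0, eta0), (x * x + y * y) * z)
```

The two printed numbers agree up to rounding and are positive, as the determinant formula predicts.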


b. Orientation of Curves on Oriented Surfaces 


We saw that it is possible to distinguish a positive and negative 
side of an oriented surface S* lying in a space with a certain orienta- 
tion of the coordinate system. In the same way, we can define the posi- 
tive and negative sides of an oriented curve C* lying on an oriented 
surface S*. Let $\xi$ be a vector tangential to the curve at a point P and
pointing in the direction determined by the orientation of C*:¹


(41f)    $\Omega(\xi) = \Omega(C^*).$


Let $\eta$ be a vector tangential to the surface at P and linearly independent
of $\xi$. We say that $\eta$ points to the positive side of C* if


(41g)    $\Omega(\eta, \xi) = \Omega(S^*).$


Conversely, we can orient a curve C lying on an oriented surface
S* by requiring that a given vector $\eta$ not tangential to C point to the
positive side of C.²

There is a natural way to orient a curve C when C forms part of the
boundary of a region $\sigma$ lying on an oriented surface S* if we require
$\sigma$ to lie on the negative side of the oriented curve C*. More precisely,


¹If X = X(t) is a parametric representation of C* and $\Omega(C^*)$ corresponds to
increasing t, the vector $\xi$ is to have the same orientation as dX/dt.

²In order to achieve greater consistency for higher dimensions the notation for
positive and negative sides of a curve has been changed from the one used in Volume I
(p. 342). Consider the special case, where S* is the plane with the usual counterclockwise
orientation when viewed from a certain side. If C* is an oriented arc with
the tangent vector $\xi$ pointing in the direction given by the orientation of C*, then by
(41g) a vector $\eta$ points to the positive side of C* if a counterclockwise rotation by an
angle less than 180° takes $\eta$ into $\xi$; that is, $\eta$ points to the right side of C* if we look
in the direction of $\xi$.

we call C* oriented positively with respect to $\sigma$ if a vector $\eta$ tangential
to S* at a point P of C* and pointing away from $\sigma$ points to the positive
side of C*. Conversely, we can indicate the orientation of a surface
S* graphically by taking a region $\sigma$ on S* and marking the positive
orientation of its boundary curve (see Fig. 5.11).¹


Figure 5.11 Oriented curve C* 
on oriented surface S*. 


If an oriented surface S* is divided into portions $S_1, S_2, \ldots, S_n$,
then any arc C that separates a portion $S_i$ from a portion $S_k$ receives
opposite orientations when oriented positively with respect to those
portions. This follows immediately from the fact that any vector $\eta$
tangential to S at a point P of C and pointing into $S_i$ points away
from $S_k$ (see Fig. 5.12).


Figure 5.12 


¹In this manner of indicating orientation of a surface S* by that of a curve C* on it,
we have to specify clearly the set $\sigma$ with respect to which the curve C* is to have
positive orientation. Ordinarily, C* is a “small” simple closed curve dividing S into
two portions, exactly one of which is also small and which is then taken for $\sigma$.


Exercises 5.7


1. Let S be the two-dimensional surface (“product of two circles”) in four-space
given by

$$X = (\cos u, \sin u, \cos v, \sin v).$$

Prove that the vectors

$$\xi = (-x_2, x_1, -x_4, x_3), \qquad \eta = (-x_2, x_1, x_4, -x_3)$$


determine an orientation on S.

2. Let S* be the torus with the parametric representation given in Chapter
3 (p. 286) and oriented positively with respect to the parameters $\theta$, $\phi$.
Prove that S* is oriented positively with respect to its interior.

3. Let S be the Möbius band represented parametrically as in (40u). 

(a) Show that the line v = a/2 divides S into an orientable and a 
nonorientable set. 

(b) Show that the line v = 0 does not divide S, that is, that the set $S_1$
of points obtained by removing from S all points with v = 0 is still
connected.

(c) Show that $S_1$ is orientable.


4. Let $\xi$, $\eta$ be independent vectors in the plane $\pi$. Put $a = |\xi|^2$, $b = \xi\cdot\eta$,
$c = |\eta|^2$ and form for any t the vector


$$R(t) = \left(\cos t - \frac{b}{\sqrt{ac - b^2}}\sin t\right)\xi + \frac{a\sin t}{\sqrt{ac - b^2}}\,\eta.$$


Prove that R(t) is obtained by rotating the vector $\xi$ in the plane $\pi$ by
an angle t in the sense given by the orientation $\Omega(\xi, \eta)$.


5.8 Integrals of Differential Forms and of Scalars over Surfaces 


a. Double Integrals over Oriented Plane Regions 


In the original definitions of single and multiple integrals, say as 
limits of Riemann sums, orientation plays no role. The integral of a 
function f is based on the use of length, areas, volumes, and so on, of 
elementary figures that, naturally enough, are given positive values. 
The use of signed quantities, amounting to the introduction of
orientations, however, imposes itself right away if we want to have simple
rules of operating with integrals.¹ Thus, the definite integral


$$\int_a^b f(x)\,dx$$


¹Generally, mathematics would become intolerably clumsy if we restricted ourselves
to using only positive quantities, for example, to positive distances instead of signed
distances as coordinates. This would necessitate innumerably many distinctions
between different cases in the proof and statement of simple theorems. Positivity
is an essential element in the formulation of inequalities between mathematical
objects but complicates the formulation of most identities, which are based
usually on unrestricted algebraic manipulation of quantities.

is defined as limit of Riemann sums for a < b. If we want the additivity
rule


$$\int_a^b f(x)\,dx + \int_b^c f(x)\,dx = \int_a^c f(x)\,dx$$


to hold without restricting the relative positions of a, b, c, we have
to define


$$\int_a^b f\,dx$$


as well for $a \ge b$ by the formula


(42a)    $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$


(see Volume I, p. 136). Geometrically, the ordered pair of numbers
a, b determines an oriented interval I* on the x-axis with “initial”
point a and “final” point b. Here the value of


(42b)    $\int_{I^*} f\,dx = \int_a^b f\,dx$


is the one given by the limit of Riemann sums (which is positive for
positive f) when the orientation of I* corresponds to the sense of
increasing x, that is, for a < b. It is the negative of that limit for
a > b. Interchanging the end points of I* converts I* into the interval
$-I^*$, with the opposite orientation, so that formula (42a) can
also be written as


(42c)    $\int_{-I^*} f\,dx = -\int_{I^*} f\,dx.$
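
The convention (42a) is easy to imitate in a computation. The following is a minimal sketch, not taken from the text: the function name `oriented_integral` and the midpoint rule are illustrative choices. It shows that with the signed definition the additivity rule holds for any relative position of a, b, c.

```python
def oriented_integral(f, a, b, n=1000):
    """Midpoint-rule approximation of the integral of f over the oriented
    interval from a to b; for a > b the sign is reversed, as in (42a)."""
    if a == b:
        return 0.0
    if a > b:
        return -oriented_integral(f, b, a, n)
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def f(x):
    return x * x

# Here b = 2 lies outside the interval between a = 0 and c = 1.
left = oriented_integral(f, 0.0, 2.0) + oriented_integral(f, 2.0, 1.0)
right = oriented_integral(f, 0.0, 1.0)
print(left, right)
```

The two printed values agree up to the discretization error of the midpoint rule.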


A similar situation holds for the integral over an oriented
(Jordan-measurable) set R* in the x, y-plane.¹ When R* is oriented positively
with respect to x, y-coordinates, $\Omega(R^*) = \Omega(x, y)$, the double integral


¹Orientation of R* is defined here in accordance with the general definition of
orientation of surfaces. It is determined by associating with each point of R* an orientation
(described, for example, by a pair of vectors), the orientations varying continuously
from point to point. For a connected set only two distinct orientations are possible.

$$\iint_{R^*} f(x, y)\,dx\,dy$$


is to be understood in the sense defined in Chapter 4. That is, the
integral is the limit of sums obtained from subdivisions of the plane
into squares of area $2^{-2n}$. The integral will have a nonnegative value
for nonnegative f. In case $\Omega(R^*) = -\Omega(x, y) = \Omega(y, x)$, we define
the integral of f over R* by


$$\iint_{R^*} f\,dx\,dy = -\iint_{R^*} f\,dy\,dx,$$


where now

$$\iint_{R^*} f\,dy\,dx$$


has the ordinary meaning as the limit of sums. As a consequence, we
have the rule that


(43)    $\iint_{-R^*} f\,dx\,dy = -\iint_{R^*} f\,dx\,dy,$


where $-R^*$ is obtained by changing the orientation of R*. With this
convention the substitution rule [see (16b), p. 403], in the form


(43a)    $\iint_{R^*} f(x, y)\,dx\,dy = \iint_{T^*} f(\phi(u, v), \psi(u, v))\,\dfrac{d(x, y)}{d(u, v)}\,du\,dv,$

holds for smooth 1-1 mappings


$$x = \phi(u, v), \qquad y = \psi(u, v)$$


of T* onto R* as long as the Jacobian d(x, y)/d(u, v) is either positive
throughout T* or negative throughout T*. Here the orientation
of T* has to be the one corresponding to that of R* under the
mapping.¹ If, for example, $\Omega(R^*) = -\Omega(x, y)$ and if d(x, y)/d(u, v) < 0,


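¹In order to find that orientation, we form, in accordance with (40o, p), the vectors
$$X_u = (x_u, y_u), \qquad X_v = (x_v, y_v)$$
and put

$$\Omega(R^*) = \varepsilon\,\Omega(X_u, X_v) = \varepsilon\,\operatorname{sgn}\begin{vmatrix} x_u & x_v \\ y_u & y_v \end{vmatrix}\,\Omega(x, y),$$

where $\varepsilon = \pm 1$ has the value determined by
$$\Omega(R^*) = \Omega(T^*) = \varepsilon\,\Omega(u, v).$$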




then $\Omega(T^*) = \Omega(u, v)$. We might say that the orientation of R*
attributes a certain sign to the differential form dx dy: the positive
sign if the x, y-coordinate system has the orientation of R*, the
negative one otherwise. The sign attributed by the orientation of
T* to the form du dv is then the one that agrees with the relationship


$$dx\,dy = \frac{d(x, y)}{d(u, v)}\,du\,dv.$$


In the same way we can define triple integrals
$$\iiint_{R^*} f(x, y, z)\,dx\,dy\,dz$$
over oriented sets in x, y, z-space and similarly in higher dimensions.


b. Surface Integrals of Second-Order Differential Forms 


We can now give a general definition for the integral of any
second-order differential form $\omega$ over an oriented surface S* in space.
Let $\omega$ be given by the expression


(44)    $\omega = a(x, y, z)\,dy\,dz + b(x, y, z)\,dz\,dx + c(x, y, z)\,dx\,dy.$


Assume first that the whole surface S* under consideration can be
represented parametrically in the form


(45)    $x = x(u, v), \quad y = y(u, v), \quad z = z(u, v),$


with (u, v) varying over a set R* in the u, v-plane. Here R* has a certain
orientation determined by that of S* (see p. 581).¹
We can write $\omega$ in the form


$$\omega = K\,du\,dv,$$

where

(46)    $K = a\,\dfrac{d(y, z)}{d(u, v)} + b\,\dfrac{d(z, x)}{d(u, v)} + c\,\dfrac{d(x, y)}{d(u, v)},$

and define

(46a)    $\iint_{S^*} \omega = \iint_{R^*} K\,du\,dv = \iint_{R^*} \left(a\,\dfrac{d(y, z)}{d(u, v)} + b\,\dfrac{d(z, x)}{d(u, v)} + c\,\dfrac{d(x, y)}{d(u, v)}\right) du\,dv.$


¹The rule for orienting R* is as follows: $\Omega(R^*) = \varepsilon\,\Omega(u, v)$ with $\varepsilon = \pm 1$ if $\Omega(S^*) =
\varepsilon\,\Omega(X_u, X_v)$, where X = (x, y, z) is the position vector.

The value obtained in this way for the integral of $\omega$ over the oriented
surface S* is independent of the particular parametric representation
for S*. If the surface can also be referred to parameters u', v', we have
(see p. 308)


$$\omega = K'\,du'\,dv',$$

where

$$K' = K\,\frac{d(u, v)}{d(u', v')}.$$


The orientation of the region of integration $R'^*$ in the u', v'-plane is
then such that the substitution rule (43a) applies and


$$\iint_{R^*} K\,du\,dv = \iint_{R'^*} K\,\frac{d(u, v)}{d(u', v')}\,du'\,dv' = \iint_{R'^*} K'\,du'\,dv'.$$
d(u’, v R* 


Let, for example, S* be representable nonparametrically in the
form z = f(x, y) with (x, y) varying over the vertical projection R*
of S* onto the x, y-plane. The orientation of S* determines an
orientation for R*. The orientation of S* can be described by specifying the
normal of S* that points to the positive side of S*, when the
orientation of space is that of the x, y, z-coordinate system. When that
normal forms an acute angle with the positive z-axis, the orientation
of R* is that of the x, y-system, otherwise that of the y, x-system.¹ In
either case we have


$$J = \iint_{S^*} (a\,dy\,dz + b\,dz\,dx + c\,dx\,dy) = \iint_{R^*} (c - a f_x - b f_y)\,dx\,dy.$$
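
Definition (46a) can be tried out numerically and compared with the nonparametric formula. The following is a minimal sketch under stated assumptions: the surface is the paraboloid $z = x^2 + y^2$ over the unit disk, the form is $\omega = x\,dy\,dz + y\,dz\,dx + z\,dx\,dy$, the parameters are polar coordinates (r, t), and the helper name `surface_integral` as well as the use of central differences and a midpoint rule are illustrative choices. For this example $c - a f_x - b f_y = -(x^2 + y^2)$, so the nonparametric formula gives $-\pi/2$ when the normal forms an acute angle with the positive z-axis.

```python
import math

def surface_integral(X, omega, u_range, v_range, n=100, h=1e-6):
    """Midpoint-rule approximation of (46a) for the surface X(u, v),
    oriented positively with respect to the parameters (u, v).
    omega(x, y, z) returns the coefficients (a, b, c) of the form (44)."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # central differences for X_u and X_v
            Xu = [(p - m) / (2 * h) for p, m in zip(X(u + h, v), X(u - h, v))]
            Xv = [(p - m) / (2 * h) for p, m in zip(X(u, v + h), X(u, v - h))]
            # Jacobians d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)
            jyz = Xu[1] * Xv[2] - Xu[2] * Xv[1]
            jzx = Xu[2] * Xv[0] - Xu[0] * Xv[2]
            jxy = Xu[0] * Xv[1] - Xu[1] * Xv[0]
            a, b, c = omega(*X(u, v))
            total += (a * jyz + b * jzx + c * jxy) * du * dv
    return total

def paraboloid(r, t):
    """z = x^2 + y^2 in polar parameters; d(x, y)/d(r, t) = r > 0."""
    return (r * math.cos(t), r * math.sin(t), r * r)

def omega(x, y, z):
    return (x, y, z)

upward = surface_integral(paraboloid, omega, (0.0, 1.0), (0.0, 2 * math.pi))
# Interchanging the parameters reverses the orientation of the surface.
reversed_ = surface_integral(lambda t, r: paraboloid(r, t), omega,
                             (0.0, 2 * math.pi), (0.0, 1.0))
print(upward, reversed_, -math.pi / 2)
```

The first value is close to $-\pi/2$, in agreement with the nonparametric formula, and interchanging the parameters changes only the sign.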


¹See p. 578. In the first case with S* referred to the parameters x, y the positive
normal $\zeta$ has the direction of the vector $(-f_x, -f_y, 1)$, and thus, $\det(\zeta, X_u, X_v) > 0$.

It is now easy to get rid of the special assumption that the whole
surface S* can be represented by means of a single parametric
representation. We assume that the oriented surface S* can be divided into
a finite number of oriented portions $S_1^*, S_2^*, \ldots, S_N^*$, in such a way
that each portion has a parametric representation of the kind
discussed. We form the surface integral of the form $\omega$ for each of the
portions according to the definition above, and define the integral of
$\omega$ over S* as the sum of the integrals over the $S_i^*$. One has to show,
of course, that the integral over S* defined in this way does not
depend on the particular subdivision of S* into portions $S_i^*$. For
the exact assumptions needed for this to be true and the proof, see the
Appendix to this chapter.


c. Relation Between Integrals of Differential Forms over Oriented 
Surfaces to Integrals of Scalars over Unoriented Surfaces 


In Chapter 4 (p. 424) we introduced the area A of a surface S in 
space without any reference to its orientation. If S has the parametric 
representation 


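$$x = x(u, v), \quad y = y(u, v), \quad z = z(u, v)$$
and if ξ, η, ζ denote the components of the normal vector

$$\xi = \frac{d(y, z)}{d(u, v)}, \quad \eta = \frac{d(z, x)}{d(u, v)}, \quad \zeta = \frac{d(x, y)}{d(u, v)} \tag{46b}$$

[see (30a) p. 428], the area of S is given by
$$A = \iint_R \sqrt{\xi^2 + \eta^2 + \zeta^2}\,du\,dv.$$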


Here the integral is extended over the set R in the u, v-plane cor- 
responding to S. The integral is understood in the original sense of 
a double integral in which the surface element 


$$dS = \sqrt{\xi^2 + \eta^2 + \zeta^2}\,du\,dv$$


is treated as a positive quantity or, equivalently, in which R is given
the positive orientation with respect to the u, v-system.¹ Orientability
of S is not essential for the definition of A. The reader can, for
example, easily express as an integral the total area of the unorientable
Möbius band with the parametric representation given on p. 583.

¹If we introduce the position vector X = (x, y, z), the quantity √(ξ² + η² + ζ²)
represents the length of the vector product of the vectors X_u and X_v. By (30b), p. 428,
it can also be written as

$$\sqrt{EG - F^2} = \sqrt{(\mathbf{X}_u \cdot \mathbf{X}_u)(\mathbf{X}_v \cdot \mathbf{X}_v) - (\mathbf{X}_u \cdot \mathbf{X}_v)^2} = \sqrt{[\mathbf{X}_u, \mathbf{X}_v;\ \mathbf{X}_u, \mathbf{X}_v]}.$$

The differential dS has the same invariance properties as a second order alternating
differential form under parametric substitutions with positive Jacobian but changes
sign under substitutions with negative Jacobian.

More generally, for a function f (x, y, z) defined on the surface S, 
we can form the integral of f over the surface: 


$$\iint_S f\,dS = \iint_R f\,\sqrt{\xi^2 + \eta^2 + \zeta^2}\,du\,dv. \tag{47a}$$


The value of the integral is independent of the particular parameter 
representation used for S and does not involve any orientation of 
S. It is positive for positive f. 
In order to relate the integral of a second-order differential form 
$$\omega = a(x, y, z)\,dy\,dz + b(x, y, z)\,dz\,dx + c(x, y, z)\,dx\,dy$$

over an oriented surface S* to the surface integrals of functions over 
the unoriented surface S as defined just now, we introduce the direc- 
tion cosines of the positive normal of S* 


$$\cos\alpha = \frac{\varepsilon\,\xi}{\sqrt{\xi^2 + \eta^2 + \zeta^2}}, \quad \cos\beta = \frac{\varepsilon\,\eta}{\sqrt{\xi^2 + \eta^2 + \zeta^2}}, \quad \cos\gamma = \frac{\varepsilon\,\zeta}{\sqrt{\xi^2 + \eta^2 + \zeta^2}},$$

where ξ, η, ζ are given by (46b), and ε = ±1, Ω(S*) = εΩ(X_u, X_v).
Then, by (46),

$$K = a\xi + b\eta + c\zeta = \varepsilon\,(a\cos\alpha + b\cos\beta + c\cos\gamma)\,\sqrt{\xi^2 + \eta^2 + \zeta^2}.$$

Now, by (46a),

$$\iint_{S^*} \omega = \iint_{R^*} K\,du\,dv = \varepsilon \iint_R K\,du\,dv.$$
Consequently, (47a) yields the identity

$$\begin{aligned}
\iint_{S^*} \omega &= \iint_{S^*} a\,dy\,dz + b\,dz\,dx + c\,dx\,dy \\
&= \iint_S (a\cos\alpha + b\cos\beta + c\cos\gamma)\,dS \\
&= \iint_R (a\cos\alpha + b\cos\beta + c\cos\gamma)\,\sqrt{\xi^2 + \eta^2 + \zeta^2}\,du\,dv,
\end{aligned} \tag{47b}$$

which expresses the integral of the differential form ω over the
oriented surface S* as an integral over the unoriented surface S or
over the unoriented region R in the parameter plane. Here, however,
the integrand depends on the orientation of S*, since cos α, cos β,
cos γ are the direction cosines of that normal n of S* that points to
the positive side of S* (using a positive space orientation with respect
to x, y, z-coordinates).

If the oriented surface S* consists of several portions $S_k^*$ each of
which permits a parametric representation of the form (45), we apply
identity (47b) to each portion and, by addition over the different
portions, obtain the same identity for the integral of ω over the whole
surface S*.

The direction cosines of the normal n pointing to the positive side 
of S* can be identified with the derivatives of x, y, z in the direction 
of n: 


$$\cos\alpha = \frac{dx}{dn}, \quad \cos\beta = \frac{dy}{dn}, \quad \cos\gamma = \frac{dz}{dn}.$$


Thus, 


$$\iint_{S^*} \omega = \iint_S \left( a\,\frac{dx}{dn} + b\,\frac{dy}{dn} + c\,\frac{dz}{dn} \right) dS. \tag{47c}$$


In vector notation the formula reduces to 
$$\iint_{S^*} \omega = \iint_S \mathbf{V} \cdot \mathbf{n}\,dS, \tag{47d}$$

where n = (cos α, cos β, cos γ) is the unit normal vector on the positive
side of S*, and V the vector with components a, b, c.

The concept of surface integral can be interpreted intuitively in 
terms of the flow of an incompressible fluid (this time in three dimen- 
sions) whose density we take as unity. Let the vector V = (a, b, c) 
be the velocity vector of this flow. Then at each point of the surface 
S* the product V - n gives the component of the velocity of flow in the 
direction of the normal n to the surface. The expression 


$$\mathbf{V} \cdot \mathbf{n}\,dS = (a\cos\alpha + b\cos\beta + c\cos\gamma)\,dS$$


can therefore be identified with the amount of fluid that flows in unit
time across the element of surface dS from the negative side of S*
to the positive side (this quantity may, of course, be negative).¹ The
surface integral

$$\iint_{S^*} (a\,dy\,dz + b\,dz\,dx + c\,dx\,dy) = \iint_S \mathbf{V} \cdot \mathbf{n}\,dS \tag{48}$$


therefore represents the total amount of fluid flowing across the 
surface S* from the negative to the positive side in unit time. We 
notice here that an important part is played in the mathematical 
description of the motion of fluid by the distinction between the 
positive and negative sides of a surface, that is, by the introduction 
of orientation. 
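
The flux interpretation can be tried out numerically. The sketch below is an illustration only, not part of the original text: it assumes the unit sphere as the surface, parametrized by the polar angle u = θ and the azimuth v = φ, and approximates the integral of K du dv from (46a) by the midpoint rule. The function name `flux_through_unit_sphere` and the two sample velocity fields are hypothetical choices made for the example.

```python
import math

def flux_through_unit_sphere(field, n_theta=200, n_phi=400):
    """Approximate the integral of K du dv from (46a) by the midpoint rule.

    The unit sphere is parametrized by u = theta (polar angle) and
    v = phi (azimuth). With this ordering of the parameters the vector
    (xi, eta, zeta) points away from the origin, so the positive side
    of the surface is taken to be the outside.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        st, ct = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            sp, cp = math.sin(phi), math.cos(phi)
            x, y, z = cp * st, sp * st, ct
            # Partial derivatives of (x, y, z) with respect to u and v.
            xu, yu, zu = cp * ct, sp * ct, -st
            xv, yv, zv = -sp * st, cp * st, 0.0
            # The three Jacobians of (46b).
            xi = yu * zv - yv * zu
            eta = zu * xv - zv * xu
            zeta = xu * yv - xv * yu
            a, b, c = field(x, y, z)
            total += (a * xi + b * eta + c * zeta) * d_theta * d_phi
    return total

# Radial flow V = (x, y, z): every fluid particle moves outward.
radial_flux = flux_through_unit_sphere(lambda x, y, z: (x, y, z))
# Uniform flow V = (0, 0, 1): what enters below leaves above.
uniform_flux = flux_through_unit_sphere(lambda x, y, z: (0.0, 0.0, 1.0))
print(radial_flux, 4.0 * math.pi)
print(uniform_flux)
```

For the radial field the integrand K reduces to sin θ, so the computed value should be close to 4π; for the uniform field the contributions of the lower and upper hemispheres cancel.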

In other physical applications the vector V denotes the force due to 
a field acting at a point (x, y, z). The direction of the vector V then 
gives the direction of the lines of force and its magnitude gives the 
magnitude of the force. In this interpretation the integral 


$$\iint_{S^*} (a\,dy\,dz + b\,dz\,dx + c\,dx\,dy)$$


is called the total flux of force across the surface from the negative to 
the positive side. 


5.9 Gauss’s and Green’s Theorems in Space 


a. Gauss’s Theorem 


The concept of surface integral leads to an extension to three 
dimensions of Gauss’s theorem, which we proved on p. 545 for two 
dimensions. The essential point in the statement of the theorem in 
two dimensions is that an integral over a plane region is reduced to 
a line integral taken around the boundary of the region. We now 
consider a closed bounded three-dimensional region R in x, y, z-space 
bounded by a surface S that is intersected by every parallel to one of
the coordinate axes in, at most, two points. This last assumption will 
be removed later. 

¹See the analogous two-dimensional interpretation on p. 570. We think here of the
surface in the neighborhood of a point as approximated by a plane piece of area ΔS
and of the velocity vector V as replaced by a constant vector. A suitable passage to
the limit furnishes the integral representation for the amount of liquid crossing S*.

Let the three functions a(x, y, z), b(x, y, z), c(x, y, z) and their
first partial derivatives be continuous in R. We consider the integral

$$\iiint_R \frac{\partial c(x, y, z)}{\partial z}\,dx\,dy\,dz$$


taken over the region R, oriented positively with respect to x, y, z-coordinates.
The region R can be described by inequalities

$$z_0(x, y) \le z \le z_1(x, y),$$


where (x, y) varies over the projection B of R onto the x, y-plane. We
assume that B has an area and that the functions z_0(x, y) and z_1(x, y)
are continuous and have continuous first derivatives in B. We can
transform the volume integral over R by means of the formula (see p. 531)

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \iint_B dx\,dy \int_{z_0}^{z_1} f(x, y, z)\,dz.$$

Since here f = ∂c/∂z the integration with respect to z can be
carried out, yielding

$$\int_{z_0}^{z_1} \frac{\partial c}{\partial z}\,dz = c(x, y, z_1) - c(x, y, z_0) = c_1 - c_0,$$

so that

$$\iiint_R \frac{\partial c(x, y, z)}{\partial z}\,dx\,dy\,dz = \iint_B c_1\,dx\,dy - \iint_B c_0\,dx\,dy.$$


If we assume that the boundary S is positively oriented with respect
to the region R, then the portion of the oriented boundary surface
S* consisting of the points of entry z = z_0(x, y) has a negative
orientation with respect to x, y-coordinates when projected on the
x, y-plane,¹ while the portion z = z_1(x, y) consisting of the points of exit
has a positive orientation. Hence, the last two integrals combine to
form the integral

$$\iint_{S^*} c(x, y, z)\,dx\,dy$$
taken over the whole surface S*. We thus obtain the formula

$$\iiint_R \frac{\partial c(x, y, z)}{\partial z}\,dx\,dy\,dz = \iint_{S^*} c(x, y, z)\,dx\,dy.$$


¹See p. 593. On z = z_0(x, y) the positive normal (the one exterior to R) points
downward.

The formula remains valid if S* contains cylindrical portions
perpendicular to the x, y-plane, for these contribute nothing to the
integral. If, for example, such a portion S'* of S* has the
representation y = φ(x), we have for S'* the parameter representation

$$x = u, \quad y = \phi(u), \quad z = v$$

and, thus, indeed

$$\iint_{S'^*} c\,dx\,dy = \iint c\,\frac{d(x, y)}{d(u, v)}\,du\,dv = 0.$$

If we derive the corresponding formulae for the components a and
b and add the three formulae, we obtain the general formula

$$\iiint_R \left[ \frac{\partial a(x, y, z)}{\partial x} + \frac{\partial b(x, y, z)}{\partial y} + \frac{\partial c(x, y, z)}{\partial z} \right] dx\,dy\,dz = \iint_{S^*} \bigl( a(x, y, z)\,dy\,dz + b(x, y, z)\,dz\,dx + c(x, y, z)\,dx\,dy \bigr), \tag{49}$$


which is known as Gauss’s theorem. Using formula (47b) of p. 595, 
we can also write this in the form 


$$\begin{aligned}
\iiint_R (a_x + b_y + c_z)\,dx\,dy\,dz &= \iint_S (a\cos\alpha + b\cos\beta + c\cos\gamma)\,dS \\
&= \iint_S \left( a\,\frac{dx}{dn} + b\,\frac{dy}{dn} + c\,\frac{dz}{dn} \right) dS.
\end{aligned} \tag{50}$$

Here, corresponding to the positive orientation of S* with respect
to R, we have in α, β, γ the angles the outward-drawn normal n makes
with the positive coordinate axes.
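
As a numerical illustration of (50), the following sketch compares the two sides on the unit cube 0 ≤ x, y, z ≤ 1, whose boundary consists of six plane faces. The field A = (x²y, yz, xz²), its divergence computed by hand, and all function names are assumptions chosen for this example; they do not come from the text.

```python
def field_A(x, y, z):
    # Hypothetical sample field (a, b, c) used only for this check.
    return (x * x * y, y * z, x * z * z)

def div_A(x, y, z):
    # a_x + b_y + c_z for the sample field above, differentiated by hand.
    return 2.0 * x * y + z + 2.0 * x * z

def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(k + 0.5) / n for k in range(n)]

def divergence_integral(div, n=20):
    """Left side of (50): midpoint rule for the triple integral over the cube."""
    pts = midpoints(n)
    cell_volume = (1.0 / n) ** 3
    return sum(div(x, y, z) for x in pts for y in pts for z in pts) * cell_volume

def outward_flux(field, n=20):
    """Right side of (50): on the faces x = 1 and x = 0 the outward normal is
    (+1, 0, 0) and (-1, 0, 0), so only the component a contributes there;
    the faces y = const and z = const are handled in the same way."""
    pts = midpoints(n)
    cell_area = (1.0 / n) ** 2
    total = 0.0
    for s in pts:
        for t in pts:
            total += field(1.0, s, t)[0] - field(0.0, s, t)[0]
            total += field(s, 1.0, t)[1] - field(s, 0.0, t)[1]
            total += field(s, t, 1.0)[2] - field(s, t, 0.0)[2]
    return total * cell_area

volume_side = divergence_integral(div_A)
surface_side = outward_flux(field_A)
print(volume_side, surface_side)
```

Both integrands here are linear in each variable separately, so the midpoint rule reproduces the exact value 3/2 on each side up to rounding.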

This formula can easily be extended to more general regions. We
have only to require that the region R be capable of being subdivided
by a finite number of portions of surfaces with continuously turning
tangent planes, into subregions $R_i$ each of which has the properties
assumed above (in particular, that each $R_i$ has a boundary consisting
of surfaces that are either intersected by every parallel to a coordinate
axis in, at most, two points or are portions of cylinders with
generators parallel to one of the coordinate axes). Gauss’s theorem holds
for each region $R_i$. On adding, we obtain on the left a triple integral
over the whole region R; on the right, some of the surface integrals
combine to form the integral over the oriented surface S, while the
others (namely, those taken over the surfaces by which R is
subdivided) cancel one another, as we have already seen in the case of
the plane (p. 549).¹

As a special case of Gauss’s theorem, we obtain the formula for the 
volume of a region R bounded by a surface S* oriented positively with 
respect to R. If, for example, we put in (49) a = 0, b = 0, c = z, we 
immediately obtain the expression 


$$V = \iiint_R dx\,dy\,dz = \iint_{S^*} z\,dx\,dy$$

for the volume. In the same way, we find² that

$$V = \iint_{S^*} x\,dy\,dz = \iint_{S^*} y\,dz\,dx.$$


¹The proof for general R that we have given here makes use of a definition of integral
over a closed surface S that has actually not been shown to be independent of the
particular way in which S is divided into portions with simple parameter
representations. The proof that for smooth S the integral over S is independent of the
subdivision will be given in the Appendix, p. 635. In the extension of Gauss’s theorem
to more general regions R given above, however, we necessarily make use of
subregions $R_i$ bounded by surfaces $S_i$ that have edges and are not perfectly smooth. For
that reason, it is more convenient to use a quite different technique of proof that
does not involve decomposition of R into disjoint subsets $R_i$, which cannot possibly
have smooth boundaries. This is achieved by the method of partition of unity, in
which, effectively, R is represented as union of overlapping regions $R_i$ with smooth
boundaries, to each of which the theorem applies directly. See the Appendix to this
chapter, pp. 639-642.

²It is noteworthy that cyclic interchange of x, y, z in these expressions for V brings
about no change in sign, in contrast to the corresponding formulae for the area of a
two-dimensional region bounded by an oriented curve C*:

$$A = \int_{C^*} x\,dy = -\int_{C^*} y\,dx.$$

This is so because in two dimensions an interchange of the positive x-direction with
the positive y-direction reverses the orientation of the plane: Ω(x, y) = −Ω(y, x),
while a cyclic interchange of coordinates in three-space preserves the orientation of
space:

$$\Omega(x, y, z) = \Omega(y, z, x) = \Omega(z, x, y).$$

If A is the vector with components a, b, c, we have in a_x + b_y + c_z
the divergence of A, and in

$$a\,\frac{dx}{dn} + b\,\frac{dy}{dn} + c\,\frac{dz}{dn}$$

the scalar product of the vectors A and n, that is, the normal
component A_n of the vector A. Hence, in vector notation Gauss’s theorem
becomes¹

$$\iiint_R \operatorname{div} \mathbf{A}\,dx\,dy\,dz = \iint_S \mathbf{A} \cdot \mathbf{n}\,dS = \iint_S A_n\,dS. \tag{52}$$


More striking is the formulation of Gauss’s theorem (49) in
terms of exterior differential forms. The second-order differential form

$$\omega = a(x, y, z)\,dy\,dz + b(x, y, z)\,dz\,dx + c(x, y, z)\,dx\,dy$$
has as its derivative [see (58c), p. 313] the third-order form
$$d\omega = (a_x + b_y + c_z)\,dx\,dy\,dz.$$

Denoting by S* the boundary of R oriented positively with respect to
R, we have simply

$$\iiint_R d\omega = \iint_{S^*} \omega. \tag{53}$$


Heretofore we have made the assumption that the three-dimensional
region R is oriented positively with respect to x, y, z-coordinates.
We can free ourselves from this assumption by observing that ω in
(53) stands for an arbitrary second-order differential form and that the
relation between ω and dω is independent of coordinates used. Denote
by R* an oriented region in space and by ∂R* its boundary oriented
positively with respect to R*. We can always choose an x, y, z-system
with respect to which R* is oriented positively, so that (53) holds
with S* = ∂R* (see p. 591). With these conventions we have for any
orientation of R*

$$\iiint_{R^*} d\omega = \iint_{\partial R^*} \omega. \tag{53a}$$

¹Notice that in the surface integrals the orientation given to S only affects the
integrand.

Precisely analogous formulae hold more generally for sets of
any number of dimensions, as we shall see.¹


Exercises 5.9a 


1. Evaluate the surface integral 


[JF as 


taken over the half of the ellipsoid x²/a² + y²/b² + z²/c² = 1 for which z
is positive, where 1/p = lx/a² + my/b² + nz/c², l, m, n being the direction
cosines of the outward-drawn normal.
2. Evaluate the surface integral

$$\iint H\,dS$$

taken over the sphere of radius unity with center at the origin, where
$$H = a_1 x^4 + a_2 y^4 + a_3 z^4 + 3a_4 x^2 y^2 + 3a_5 y^2 z^2 + 3a_6 x^2 z^2.$$


b. Application of Gauss’s Theorem to Fluid Flow 


As in the case of the plane, we can obtain a physical interpretation
for Gauss’s theorem in space by taking the vector A = (a, b, c) as the
momentum vector in the flow of a fluid of density ρ whose velocity is
given by the vector V = (u, v, w). Here ρ and the velocity components
u, v, w depend on the point (x, y, z) and the time t considered. The momentum
vector (per unit volume) is defined by A = ρV. If R is a fixed region
in space bounded by the surface S, then the total mass of fluid that
in unit time flows across a small portion of S of area ΔS from the
interior to the exterior of R is given approximately by the expression
ρV_n ΔS, where V_n is the component of the velocity vector V in the
direction of the outward normal n at a point of the surface element.
Accordingly, the total amount of fluid that flows across the boundary
S of R from the inside to the outside in unit time is given by the
integral

$$\iint_S \rho V_n\,dS = \iint_S A_n\,dS$$

taken over the whole boundary S. By Gauss’s identity (52) the amount
of fluid leaving R in unit time through its boundary is thus

$$\iiint_R \operatorname{div} \mathbf{A}\,dx\,dy\,dz = \iiint_R \operatorname{div}(\rho \mathbf{V})\,dx\,dy\,dz.$$

¹Generally, for an n-dimensional oriented set R* in euclidean space of n or more
dimensions the symbol ∂R* denotes the boundary of R* oriented positively with
respect to R*; that is, ∂R* is oriented in such a way that

$$\Omega(R^*) = \Omega(\mathbf{B}, \mathbf{A}^1, \ldots, \mathbf{A}^{n-1}),$$
where $\mathbf{A}^1, \ldots, \mathbf{A}^{n-1}$ are vectors tangential at some point to the boundary ∂R*,
with

$$\Omega(\partial R^*) = \Omega(\mathbf{A}^1, \mathbf{A}^2, \ldots, \mathbf{A}^{n-1}),$$
and where B is a vector tangential to and pointing away from R*.


On the other hand, the total mass of fluid contained in R at any one
time is given by the triple integral

$$\iiint_R \rho(x, y, z, t)\,dx\,dy\,dz$$

and the decrease in unit time of the mass of fluid contained in R by

$$-\frac{d}{dt} \iiint_R \rho\,dx\,dy\,dz = -\iiint_R \rho_t\,dx\,dy\,dz.$$

If the law of conservation of mass is to hold and if there are no sources
or sinks of mass in R, then the total amount of mass of fluid leaving
R through the surface S must be exactly equal to the loss of mass of
fluid contained in R. We must then have

$$\iiint_R \operatorname{div}(\rho \mathbf{V})\,dx\,dy\,dz = -\iiint_R \rho_t\,dx\,dy\,dz$$


at any time t for any region R. Dividing both sides of this identity by
the volume of R and shrinking R into a point (that is, applying space
differentiation), we obtain the three-dimensional continuity equation

$$\operatorname{div}(\rho \mathbf{V}) = -\rho_t$$
or
$$\frac{\partial \rho}{\partial t} + \frac{\partial(\rho u)}{\partial x} + \frac{\partial(\rho v)}{\partial y} + \frac{\partial(\rho w)}{\partial z} = 0, \tag{55}$$

which expresses the law of conservation of mass for motion of fluids
in the form of a differential equation.
If the law of conservation of mass is not invoked, the expression

$$\rho_t + \operatorname{div}(\rho \mathbf{V})$$
measures the amount of mass created (or annihilated, when negative)
in unit time per unit volume.
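
The continuity equation (55) can be checked on a concrete flow. The sketch below assumes, purely for illustration, a uniformly expanding fluid with density 1/(1 + t)³ and velocity (x, y, z)/(1 + t); neither this flow nor the helper names appear in the text. The left side of (55) is evaluated with central difference quotients, so the result is only approximately zero.

```python
def density(x, y, z, t):
    # Hypothetical density of a uniformly expanding fluid.
    return 1.0 / (1.0 + t) ** 3

def velocity(x, y, z, t):
    # Hypothetical velocity field (u, v, w) of the same flow.
    s = 1.0 / (1.0 + t)
    return (x * s, y * s, z * s)

def continuity_residual(x, y, z, t, h=1e-5):
    """Central-difference approximation of rho_t + div(rho V), the left side of (55)."""
    def momentum(i, px, py, pz):
        # i-th component of the momentum vector A = rho V at time t.
        return density(px, py, pz, t) * velocity(px, py, pz, t)[i]

    rho_t = (density(x, y, z, t + h) - density(x, y, z, t - h)) / (2.0 * h)
    divergence = (
        (momentum(0, x + h, y, z) - momentum(0, x - h, y, z))
        + (momentum(1, x, y + h, z) - momentum(1, x, y - h, z))
        + (momentum(2, x, y, z + h) - momentum(2, x, y, z - h))
    ) / (2.0 * h)
    return rho_t + divergence

residual = continuity_residual(0.3, -0.2, 0.5, 0.7)
print(residual)
```

For this flow ρ_t = −3/(1 + t)⁴ and div(ρV) = 3/(1 + t)⁴, so mass is conserved and the printed residual should be zero up to discretization and rounding error.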

Particular interest attaches to the case of a homogeneous and
incompressible fluid, for which the density ρ has the same value in all
places and is unchanging with time. Since ρ is then constant, we
deduce from (55) that

$$\operatorname{div} \mathbf{V} = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0 \tag{56}$$

if mass is to be preserved. It then follows from (52) that
$$\iint_S \mathbf{V} \cdot \mathbf{n}\,dS = 0 \tag{57}$$


whenever the surface S bounds a region R. Consider, in particular,
two surfaces $S_1$ and $S_2$ bounded by the same oriented curve C* in
space, and together forming the boundary S of a three-dimensional
region R. We find from (57) that

$$0 = \iint_S \mathbf{V} \cdot \mathbf{n}\,dS = \iint_{S_1} \mathbf{V} \cdot \mathbf{n}\,dS + \iint_{S_2} \mathbf{V} \cdot \mathbf{n}\,dS, \tag{58}$$

where, on both $S_1$ and $S_2$, n denotes the normal pointing away from
R. We can make both $S_1$ and $S_2$ into oriented surfaces $S_1^*$ and $S_2^*$ in
such a way that the orientation of C* is positive with respect to both
$S_1^*$ and $S_2^*$. On both these surfaces, let n* be the unit normal pointing
to the positive side. (For a right-handed orientation of space, this
means that n* points to that side of the surface from which the
orientation of C* appears counterclockwise.) Then, necessarily, n* = n on
one of the surfaces $S_1$, $S_2$ and n* = −n on the other.¹ It follows from
(58) that

$$\iint_{S_1} \mathbf{V} \cdot \mathbf{n}^*\,dS = \iint_{S_2} \mathbf{V} \cdot \mathbf{n}^*\,dS. \tag{59}$$


¹The normal n determines an orientation on the whole surface S if we require, for
example, that n points to the positive side of S. Orienting $S_1$ and $S_2$ relative to n, the
curve C receives opposite senses if we require it to be oriented positively with
respect to $S_1$ or to $S_2$ (see p. 588). However, since C* has the positive sense with
respect to both $S_1^*$ and $S_2^*$, it follows that the orientations given by n* and by n
agree only on one of the surfaces.

In words, if the fluid is incompressible and homogeneous and mass is
conserved, then the same amount of fluid flows across any two surfaces
with the same boundary curve C* that together bound a three-dimensional
region in space. This amount of fluid does not depend on the
precise form of the surfaces; it is plausible that it must be determined
by the boundary curve C* alone.¹ We then ask how we can express the
amount of fluid in terms of the curve C* alone. This question is
answered in the next section (p. 614) by means of Stokes’s theorem.


c. Gauss’s Theorem Applied to Space Forces and Surface Forces 


The forces acting in a continuum may be regarded either as space 
forces (such as gravitational attraction, electrostatic forces) or as 
surface forces (such as pressures, tractions). The connection between 
these two points of view is given by Gauss’s theorem. 

We consider only the special case of the force in a fluid of density
ρ = ρ(x, y, z), in which there is a pressure p(x, y, z), which in general
depends on the point (x, y, z). This means that the force acting on a
portion R of the liquid exerted by the remaining part of the liquid can
be considered as a force acting at each point of the surface S of R
in the direction of the inward drawn normal and of magnitude p per
unit surface area. Denoting by dx/dn, dy/dn, dz/dn the direction
cosines of the outward-drawn normal at a point of the surface S of R,
the components of the force per unit area are given by

$$-p\,\frac{dx}{dn}, \quad -p\,\frac{dy}{dn}, \quad -p\,\frac{dz}{dn}.$$

Thus, the resultant of the surface forces acting on R is a force with
components

$$X = -\iint_S p\,\frac{dx}{dn}\,dS, \quad Y = -\iint_S p\,\frac{dy}{dn}\,dS, \quad Z = -\iint_S p\,\frac{dz}{dn}\,dS.$$

By Gauss’s theorem (50), p. 599, we can write X, Y, Z as volume
integrals

$$X = -\iiint_R p_x\,dx\,dy\,dz, \quad Y = -\iiint_R p_y\,dx\,dy\,dz, \quad Z = -\iiint_R p_z\,dx\,dy\,dz.$$


In vector notation the resultant is a force F given by

$$\mathbf{F} = -\iiint_R \operatorname{grad} p\,dx\,dy\,dz. \tag{60}$$

¹The amount of fluid crossing a surface bounded by the closed curve C in unit time is
independent of time if we make the further assumption that the flow is steady, that is,
that the velocity vector V is independent of time.


We can express this result as follows. The forces in a fluid due to 
a pressure p(x, y, z) may, on the one hand, be regarded as surface 
forces (pressure) that act with density p(x, y, z) perpendicular to each 
surface element through the point (x, y, z) and, on the other hand, 
as volume forces, that is, as forces that act on every element of 
volume with volume density — grad p. 

If a fluid is in equilibrium under the forces due to pressure and to 
gravitational attraction, the vector F must balance the total at- 
tractive force G acting on the liquid contained in R: 


F+G=0. 


If the gravitational force acting on a unit mass at the point (x, y, z)
is given by the vector Γ(x, y, z), we have

$$\mathbf{G} = \iiint_R \boldsymbol{\Gamma}\,\rho\,dx\,dy\,dz.$$

From the relation F + G = 0, valid for any portion R of the fluid,
we conclude by space differentiation that the corresponding relation
holds for the integrands, that is, that at each point of the fluid the
equation

$$-\operatorname{grad} p + \rho\,\boldsymbol{\Gamma} = 0 \tag{61}$$

holds. Since the gradient of a scalar is perpendicular to the level
surfaces for that scalar, we conclude that for a fluid in equilibrium
under pressure and gravitational attraction the attraction at each point
of a surface of constant pressure p (“isobaric” surface) is perpendicular
to the surface. If we make the customary assumption that the
gravitational force per unit mass near the surface of the earth is given by the
vector Γ = (0, 0, −g), where g is the gravitational acceleration, we
find¹ from (61) that

$$p_x = 0, \quad p_y = 0, \quad p_z = -g\rho. \tag{62}$$


¹This formula was derived in Volume I (p. 226), in the description of the pressure
variations in the atmosphere.

Consider in particular a homogeneous liquid of constant density
ρ bounded by a free surface of pressure 0. Along this free surface, we
have, by (62),
$$0 = dp = p_x\,dx + p_y\,dy + p_z\,dz = -g\rho\,dz.$$

Hence, dz = 0, which means that the free surface has to be a plane
z = constant = z_0. For any point (x, y, z) of the liquid the value of
the pressure is then

$$p(x, y, z) = -\int_z^{z_0} p_z(x, y, \zeta)\,d\zeta = g\rho\,(z_0 - z).$$

Thus, at the depth z_0 − z = h the pressure has the value gρh. For a solid
partly or wholly immersed in the liquid, let R denote the portion of
the solid lying below the free surface z = z_0. We apply formula (60)
to the region R in order to determine the total pressure force acting
on the solid.¹ We find from (60) and (62) that the resultant of the
pressure forces acting on the solid is equal to a force (buoyancy) with
components

$$X = 0, \quad Y = 0, \quad Z = g\rho \iiint_R dx\,dy\,dz;$$

this force is directed vertically upward and its magnitude is equal
to the weight of the displaced liquid (Archimedes’ principle).
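
Archimedes’ principle can be confirmed by integrating the pressure directly over the surface of a body. The sketch below assumes a fully submerged sphere and the hydrostatic pressure p = gρ(z_0 − z) found above; the function name, its parameters, and the numerical values are hypothetical. It evaluates Z = −∬ p (dz/dn) dS by the midpoint rule in the polar angle and compares the result with the weight of the displaced liquid.

```python
import math

def buoyancy_on_sphere(radius, center_z, free_surface_z, g, rho, n_theta=400):
    """Vertical component Z of the pressure force on a fully submerged sphere.

    The sphere is cut into thin rings of constant polar angle theta; on
    each ring the pressure and the z-component of the outward normal are
    constant, so the surface integral reduces to a sum over rings.
    """
    if center_z + radius > free_surface_z:
        raise ValueError("this sketch assumes the sphere lies below the free surface")
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        z = center_z + radius * math.cos(theta)
        pressure = g * rho * (free_surface_z - z)   # p = g rho (z_0 - z)
        normal_z = math.cos(theta)                  # dz/dn for the outward normal
        ring_area = 2.0 * math.pi * radius ** 2 * math.sin(theta) * d_theta
        total += -pressure * normal_z * ring_area
    return total

g, rho, radius = 9.81, 1000.0, 0.5
lift = buoyancy_on_sphere(radius, center_z=-2.0, free_surface_z=0.0, g=g, rho=rho)
displaced_weight = g * rho * (4.0 / 3.0) * math.pi * radius ** 3
print(lift, displaced_weight)
```

The two printed numbers should agree to several digits, and the result does not depend on the depth of the center as long as the sphere stays below the free surface.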
¹Any portions of the boundary of R lying in the plane z = z_0 make no contribution
since there p = 0 by assumption.


d. Integration by Parts and Green’s Theorem in Three Dimensions


Just as in the case of two independent variables (p. 556), Gauss’s
theorem (50), p. 599 applied to products au, bv, cw leads to a formula
for integration by parts:

$$\iiint_R (a u_x + b v_y + c w_z)\,dx\,dy\,dz = \iint_S \left( a u\,\frac{dx}{dn} + b v\,\frac{dy}{dn} + c w\,\frac{dz}{dn} \right) dS - \iiint_R (a_x u + b_y v + c_z w)\,dx\,dy\,dz. \tag{63}$$

If here u = v = w = U and if a, b, c are of the form a = V_x, b = V_y,
c = V_z for some scalar V, we obtain Green’s first theorem

$$\iiint_R (U_x V_x + U_y V_y + U_z V_z)\,dx\,dy\,dz = \iint_S U\,\frac{dV}{dn}\,dS - \iiint_R U\,\Delta V\,dx\,dy\,dz. \tag{64}$$

Here we use the familiar symbol Δ for the Laplace operator defined by
$$\Delta V = V_{xx} + V_{yy} + V_{zz}$$

and denote by dV/dn the derivative of V in the direction of the
outward normal:

$$\frac{dV}{dn} = V_x\,\frac{dx}{dn} + V_y\,\frac{dy}{dn} + V_z\,\frac{dz}{dn}.$$

Interchanging U and V in formula (64) and subtracting from (64)
yields Green’s second theorem

$$\iiint_R (U\,\Delta V - V\,\Delta U)\,dx\,dy\,dz = \iint_S \left( U\,\frac{dV}{dn} - V\,\frac{dU}{dn} \right) dS. \tag{65}$$


e. Application of Green’s Theorem to the Transformation of ΔU to
Spherical Coordinates


If we set V = 1 in Green’s theorem (65), we obtain

$$\iiint_R \Delta U\,dx\,dy\,dz = \iint_S \frac{dU}{dn}\,dS = \iint_S (\operatorname{grad} U) \cdot \mathbf{n}\,dS. \tag{66}$$

Just as in the plane, we can use this formula to transform ΔU to
other coordinate systems, notably to the spherical coordinates r, φ,
θ defined by

$$x = r\cos\phi\,\sin\theta, \quad y = r\sin\phi\,\sin\theta, \quad z = r\cos\theta.$$

We apply formula (66) to a wedge-shaped region R described by
inequalities of the form

$$r_1 < r < r_2, \quad \phi_1 < \phi < \phi_2, \quad \theta_1 < \theta < \theta_2. \tag{67}$$


The boundary S of R consists of six faces along each of which one
of the coordinates r, φ, θ has a constant value. Applying the formula
for transformation of triple integrals we write the left side of equation
(66) in the form

$$\iiint_R \Delta U\,dx\,dy\,dz = \iiint \Delta U\,\frac{d(x, y, z)}{d(r, \theta, \phi)}\,dr\,d\theta\,d\phi = \iiint \Delta U\,r^2 \sin\theta\,dr\,d\theta\,d\phi, \tag{68}$$

with the integral in r, θ, φ-space extended over the region (67). In order
to transform the surface integral in (66) we introduce the position
vector

$$\mathbf{X} = (x, y, z) = (r\cos\phi\,\sin\theta,\ r\sin\phi\,\sin\theta,\ r\cos\theta)$$
and notice that its first derivatives satisfy the relations
$$\mathbf{X}_r \cdot \mathbf{X}_\theta = 0, \quad \mathbf{X}_\theta \cdot \mathbf{X}_\phi = 0, \quad \mathbf{X}_\phi \cdot \mathbf{X}_r = 0, \tag{68a}$$
$$\mathbf{X}_r \cdot \mathbf{X}_r = 1, \quad \mathbf{X}_\theta \cdot \mathbf{X}_\theta = r^2, \quad \mathbf{X}_\phi \cdot \mathbf{X}_\phi = r^2 \sin^2\theta. \tag{68b}$$


It follows from these relations that at each point the vector X_r is
normal to the coordinate surface r = constant passing through that
point, the vector X_θ normal to the surface θ = constant, and the
vector X_φ normal to the surface φ = constant. More precisely, on
one of the faces r = constant = r_i (where i has either the value 1
or 2) the outward normal unit vector n is given by (−1)^i X_r. Hence,
on those faces

$$(\operatorname{grad} U) \cdot \mathbf{n} = (-1)^i\,(\operatorname{grad} U) \cdot \mathbf{X}_r = (-1)^i\,\frac{\partial U}{\partial r}.$$

Using, moreover, θ and φ as parameters along a face r = r_i, we have
for the element of area the expression [see (30e), p. 429]

$$dS = \sqrt{EG - F^2}\,d\theta\,d\phi = \sqrt{(\mathbf{X}_\theta \cdot \mathbf{X}_\theta)(\mathbf{X}_\phi \cdot \mathbf{X}_\phi) - (\mathbf{X}_\theta \cdot \mathbf{X}_\phi)^2}\,d\theta\,d\phi = r^2 \sin\theta\,d\theta\,d\phi.$$

It follows that the contribution of the two faces r = r_1 and r = r_2 to
the integral of dU/dn over S is represented by the expression

$$\iint r_2^2 \sin\theta\,\frac{\partial U}{\partial r}\bigg|_{r = r_2} d\theta\,d\phi - \iint r_1^2 \sin\theta\,\frac{\partial U}{\partial r}\bigg|_{r = r_1} d\theta\,d\phi,$$


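where the integrations are taken over the rectangle

$$\theta_1 < \theta < \theta_2, \quad \phi_1 < \phi < \phi_2.$$

We can write the difference of these integrals as the triple integral

$$\iiint \frac{\partial}{\partial r}\left( r^2 \sin\theta\,\frac{\partial U}{\partial r} \right) dr\,d\theta\,d\phi$$

extended over the region (67).
Similarly, we find that on a face θ = constant = θ_i

$$\mathbf{n} = (-1)^i\,\frac{1}{r}\,\mathbf{X}_\theta, \quad dS = r \sin\theta\,d\phi\,dr, \quad \frac{dU}{dn} = \frac{(-1)^i}{r}\,\frac{\partial U}{\partial \theta}$$
and on a face φ = constant = φ_i
$$\mathbf{n} = (-1)^i\,\frac{1}{r \sin\theta}\,\mathbf{X}_\phi, \quad dS = r\,dr\,d\theta, \quad \frac{dU}{dn} = \frac{(-1)^i}{r \sin\theta}\,\frac{\partial U}{\partial \phi}.$$

Here also, combining the contributions of opposite faces θ = constant
or φ = constant, we find for the total surface integral the expression

$$\iint_S \frac{dU}{dn}\,dS = \iiint \left[ \frac{\partial}{\partial r}\left( r^2 \sin\theta\,\frac{\partial U}{\partial r} \right) + \frac{\partial}{\partial \theta}\left( \sin\theta\,\frac{\partial U}{\partial \theta} \right) + \frac{\partial}{\partial \phi}\left( \frac{1}{\sin\theta}\,\frac{\partial U}{\partial \phi} \right) \right] dr\,d\theta\,d\phi.$$

Comparing with the expression (68), dividing by the volume of the
wedge R, and shrinking the wedge to a point leads to the desired
expression for the Laplace operator in spherical coordinates:

$$\Delta U = \frac{1}{r^2 \sin\theta} \left[ \frac{\partial}{\partial r}\left( r^2 \sin\theta\,\frac{\partial U}{\partial r} \right) + \frac{\partial}{\partial \theta}\left( \sin\theta\,\frac{\partial U}{\partial \theta} \right) + \frac{\partial}{\partial \phi}\left( \frac{1}{\sin\theta}\,\frac{\partial U}{\partial \phi} \right) \right]. \tag{69}$$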
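
Formula (69) can be tested against the Cartesian definition ΔU = U_xx + U_yy + U_zz. The sketch below is an illustration under stated assumptions: the test function U = x²y + z³ (for which ΔU = 2y + 6z), the evaluation point, and the helper names are all hypothetical, and the derivatives in (69) are replaced by central difference quotients with a small step h.

```python
import math

def cartesian(r, theta, phi):
    """Spherical coordinates as defined in the text: theta is the polar angle."""
    return (
        r * math.cos(phi) * math.sin(theta),
        r * math.sin(phi) * math.sin(theta),
        r * math.cos(theta),
    )

def spherical_laplacian(U, r, theta, phi, h=1e-3):
    """Finite-difference evaluation of the right side of (69) for U(x, y, z)."""
    def u(rr, tt, pp):
        return U(*cartesian(rr, tt, pp))

    # The three expressions in parentheses in (69), each with its inner
    # derivative replaced by a central difference quotient.
    def flux_r(rr):
        return rr * rr * math.sin(theta) * (u(rr + h / 2, theta, phi) - u(rr - h / 2, theta, phi)) / h

    def flux_theta(tt):
        return math.sin(tt) * (u(r, tt + h / 2, phi) - u(r, tt - h / 2, phi)) / h

    def flux_phi(pp):
        return (u(r, theta, pp + h / 2) - u(r, theta, pp - h / 2)) / (h * math.sin(theta))

    bracket = (
        (flux_r(r + h / 2) - flux_r(r - h / 2))
        + (flux_theta(theta + h / 2) - flux_theta(theta - h / 2))
        + (flux_phi(phi + h / 2) - flux_phi(phi - h / 2))
    ) / h
    return bracket / (r * r * math.sin(theta))

def sample_U(x, y, z):
    return x * x * y + z ** 3

r0, theta0, phi0 = 1.3, 0.9, 0.4
x0, y0, z0 = cartesian(r0, theta0, phi0)
numerical = spherical_laplacian(sample_U, r0, theta0, phi0)
exact = 2.0 * y0 + 6.0 * z0   # U_xx + U_yy + U_zz computed by hand
print(numerical, exact)
```

The point is chosen away from θ = 0 and θ = π, where the factor 1/sin θ in (69) becomes singular and this simple scheme would break down.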
Exercises 5.9e 
1. Let the equations
$$x_i = x_i(p_1, p_2, p_3) \quad (i = 1, 2, 3)$$

define an arbitrary orthogonal coordinate system p_1, p_2, p_3; that is, if
we put $a_{ik} = \partial x_k / \partial p_i$, then the equations

$$a_{11}a_{21} + a_{12}a_{22} + a_{13}a_{23} = 0$$

$$a_{11}a_{31} + a_{12}a_{32} + a_{13}a_{33} = 0$$

$$a_{21}a_{31} + a_{22}a_{32} + a_{23}a_{33} = 0$$

are to hold.
(a) Prove that
$$\frac{\partial(x_1, x_2, x_3)}{\partial(p_1, p_2, p_3)} = \sqrt{e_1 e_2 e_3},$$

where
$$e_i = a_{i1}^2 + a_{i2}^2 + a_{i3}^2.$$
(b) Prove that

$$\frac{\partial p_i}{\partial x_k} = \frac{1}{e_i}\,\frac{\partial x_k}{\partial p_i} = \frac{a_{ik}}{e_i}.$$

(c) Express $\Delta u = u_{x_1 x_1} + u_{x_2 x_2} + u_{x_3 x_3}$ in terms of p_1, p_2, p_3, using
Gauss’s theorem.

(d) Express Δu in the focal coordinates t_1, t_2, t_3 defined in Exercise 9,
Section 3.3d, p. 256.


5.10 Stokes’s Theorem in Space 


a. Statement and Proof of the Theorem 


We have already seen Stokes’s theorem in two dimensions (p. 554). 
The analogous theorem in three dimensions connects the integral of 
the normal component of the curl of a vector over a curved surface 
with the integral of the tangential component of the vector over the 
boundary curve of the surface. While in two dimensions Gauss’s 
theorem and Green’s theorem go over into each other by a change in 
notation, they are essentially different theorems in three dimensions. 

Let S be an orientable surface in three-space bounded by a closed 
curve C. The choice of an orientation for S converts S into the ori- 
ented surface S*. Let C* be the boundary curve of S* oriented posi- 
tively with respect to S*. Assuming that space is oriented positively 
with respect to x, y, z-coordinates, let n at each point of S* denote 
the unit normal vector¹ pointing to the positive side of S*. Let t be the
unit tangent vector on C* pointing in the direction corresponding to
the orientation of C*. Let A = (a, b, c) be a vector defined near S.
Stokes’s theorem asserts² that

$$\iint_S (\operatorname{curl} \mathbf{A}) \cdot \mathbf{n}\,dS = \int_C \mathbf{A} \cdot \mathbf{t}\,ds. \tag{70}$$


¹In effect this means that when we move a point of S* into the origin in such a way
that n coincides with the positive z-axis, the sense of rotation on S* will be that of
the 90° rotation taking the positive x-axis into the positive y-axis.

²Precise regularity assumptions for S, C, A under which the theorem can be proved
are given in the Appendix to this chapter, p. 643.

Denoting by dx/dn, dy/dn, dz/dn the components of the vector n
and by dx/ds, dy/ds, dz/ds those of t, we write Stokes’s theorem in the
form¹

$$\iint_S \left[ (c_y - b_z)\,\frac{dx}{dn} + (a_z - c_x)\,\frac{dy}{dn} + (b_x - a_y)\,\frac{dz}{dn} \right] dS = \int_C \left( a\,\frac{dx}{ds} + b\,\frac{dy}{ds} + c\,\frac{dz}{ds} \right) ds. \tag{71}$$

Using formula (47c), p. 596, we have, equivalently,
$$\iint_{S^*} (c_y - b_z)\,dy\,dz + (a_z - c_x)\,dz\,dx + (b_x - a_y)\,dx\,dy = \int_{C^*} a\,dx + b\,dy + c\,dz. \tag{72}$$

Introducing the first-order differential form

$$L = a\,dx + b\,dy + c\,dz \tag{73a}$$

and

$$\omega = (c_y - b_z)\,dy\,dz + (a_z - c_x)\,dz\,dx + (b_x - a_y)\,dx\,dy, \tag{73b}$$
we notice (see p. 313) that ω is just the derivative of L:

$$\omega = dL. \tag{73c}$$

If ∂S* is the positively oriented boundary C* of S*,² Stokes’s theorem
becomes simply

$$\iint_{S^*} dL = \int_{\partial S^*} L. \tag{74}$$


In this form it is completely analogous to Gauss’s theorem as written 
in formula (53), p. 601. 

The truth of Stokes’s theorem can immediately be made plausible 
from the fact that the theorem has already been proved for plane 
surfaces [see formula (10), p. 555]. Consequently, if S is a polyhedral 
surface composed of plane polygonal surfaces, so that the boundary 


¹See (94c), p. 209 for the definition of the curl of a vector.
²This accords with the general definition in footnote 2, p. 587, for the case n = 2.

curve C is a polygon, we can apply Stokes’s theorem to each of the
plane portions and add the corresponding formulae. In this process
the line integrals along all the interior edges of the polyhedron cancel,
and we at once obtain Stokes’s theorem for the polyhedral surface.
In order to obtain the general statement of Stokes’s theorem, we need only
pass to the limit, leading from approximating polyhedra to arbitrary
surfaces S bounded by arbitrary curves C.

The rigorous validation of this passage to the limit, however, 
would be troublesome; therefore, having made these heuristic re- 
marks, we carry out the proof by transforming the whole surface S 
into a plane surface and by observing that the theorem is preserved 
under such transformations. 

We assume that there exists a parametric representation¹


$$x = \phi(u, v), \quad y = \psi(u, v), \quad z = \chi(u, v)$$


for S, where φ, ψ, χ are functions with continuous first derivatives for
which the vector with components


$$\xi = \frac{d(y, z)}{d(u, v)}, \quad \eta = \frac{d(z, x)}{d(u, v)}, \quad \zeta = \frac{d(x, y)}{d(u, v)} \tag{75}$$

does not vanish. Assume that there is an oriented set Σ* in the u, v-plane
bounded by an oriented closed curve Γ* such that Σ* is
mapped bi-uniquely onto the surface S* and Γ* onto C*.²

Now L determines a differential form in du and dv: 


$$\begin{aligned}
L &= a (x_u \, du + x_v \, dv) + b (y_u \, du + y_v \, dv) + c (z_u \, du + z_v \, dv) \\
  &= (a x_u + b y_u + c z_u) \, du + (a x_v + b y_v + c z_v) \, dv
\end{aligned}$$


and

$$\int_{C^*} L = \int_{\Gamma^*} L,$$

where on the right side we take L as expressed in terms of du and
dv. Similarly, ω gives rise to a second-order form in du and dv,


¹In the Appendix to this chapter the theorem will be proved more generally for
surfaces S that can be patched together from portions with a parametric representation
of the type mentioned.

²If the vector (ξ, η, ζ) has the direction of n, we have Ω(Σ*) = Ω(u, v); if (ξ, η, ζ)
has the direction of −n, we have Ω(Σ*) = −Ω(u, v). The curve Γ* is oriented
positively with respect to Σ* in either case. See p. 587.




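$$\begin{aligned}
\omega &= \left[ (c_y - b_z) \frac{d(y, z)}{d(u, v)} + (a_z - c_x) \frac{d(z, x)}{d(u, v)} + (b_x - a_y) \frac{d(x, y)}{d(u, v)} \right] du \, dv \\
       &= \left[ (c_y - b_z) \xi + (a_z - c_x) \eta + (b_x - a_y) \zeta \right] du \, dv,
\end{aligned}$$


and again [see (46a), p. 593]


$$\iint_{S^*} \omega = \iint_{\Sigma^*} \omega.$$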


Moreover, as we proved on p. 322, the relation ω = dL does not
depend on the choice of independent variables x, y, z or u, v.¹ Consequently,
the proof of identity (74) has been reduced to the case
involving a first-order differential form L in du and dv and a region
Σ* with boundary Γ* in the u, v-plane. Since Stokes’s theorem is
known to hold in the u, v-plane, it now follows for the curved surface
S.
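The invariance of the relation ω = dL under the passage to the parameters u, v can also be checked by a direct computation. The sketch below assumes, for simplicity, that φ, ψ, χ have continuous second derivatives (a stronger hypothesis than the one made in the text), so that the mixed derivatives x_uv, y_uv, z_uv cancel. The symbols P and Q are shorthand introduced here for the coefficients of du and dv in L, and ξ, η, ζ are the Jacobians of (75).

```latex
\begin{aligned}
P &= a x_u + b y_u + c z_u, \qquad Q = a x_v + b y_v + c z_v, \\
Q_u - P_v &= (a_u x_v - a_v x_u) + (b_u y_v - b_v y_u) + (c_u z_v - c_v z_u), \\
a_u x_v - a_v x_u &= a_y (y_u x_v - y_v x_u) + a_z (z_u x_v - z_v x_u) = -a_y \zeta + a_z \eta, \\
b_u y_v - b_v y_u &= b_x (x_u y_v - x_v y_u) + b_z (z_u y_v - z_v y_u) = b_x \zeta - b_z \xi, \\
c_u z_v - c_v z_u &= c_x (x_u z_v - x_v z_u) + c_y (y_u z_v - y_v z_u) = -c_x \eta + c_y \xi, \\
Q_u - P_v &= (c_y - b_z) \xi + (a_z - c_x) \eta + (b_x - a_y) \zeta .
\end{aligned}
```

The last line says that the coefficient of du dv in dL, formed in the u, v-plane, agrees with the coefficient obtained by substituting x, y, z as functions of u, v into ω.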

Stokes’s theorem answers the question raised on p. 0000. We have 
seen that for a given vector field V(x, y, z) with div V = 0, the integral 


$$\iint_{S} \mathbf{V} \cdot \mathbf{n} \, dS$$


over a surface S with unit normal n depends only on the boundary 
curve C of S and not on the particular nature of S. On the other hand, 
we found on p. 315 that a vector field V with vanishing divergence 
can be represented as the curl of a vector A = (a, b, c)—at least if 
we restrict ourselves to vector fields defined in a parallelepiped with 
edges parallel to the coordinate axes. Stokes’s theorem now enables 
us to express 


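$$\iint_{S} \mathbf{V} \cdot \mathbf{n} \, dS = \iint_{S} (\operatorname{curl} \mathbf{A}) \cdot \mathbf{n} \, dS$$

in the form

$$\int_{C} \mathbf{A} \cdot \mathbf{t} \, ds,$$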


which involves only the boundary curve C of S. 


¹This can also be verified directly by proving the identity
(c_y − b_z)ξ + (a_z − c_x)η + (b_x − a_y)ζ = (a x_v + b y_v + c z_v)_u − (a x_u + b y_u + c z_u)_v,
where ξ, η, ζ are defined by (75).

Exercises 5.10a


1. Let

$$I = \iint_{S^*} z \, dx \, dy - x \, dy \, dz,$$

where S* is the spherical cap x² + y² + z² = 1, x > 1/2, oriented positively
with respect to the normal pointing to infinity.

(a) Calculate I directly using y, z as parameters on S*.

(b) Calculate I from Stokes’s formula (74), p. 612, observing that


z dx dy − x dy dz = dL

with

L = −yz dx − xy dz.


b. Interpretation of Stokes’s Theorem 


The physical interpretation of Stokes’s theorem in three dimen- 
sions is similar to that already given (p. 572) in two dimensions. 
Once again we interpret the vector field V = (v₁, v₂, v₃) as the velocity
field of the flow of a fluid. We call the integral


$$\int_{C^*} \mathbf{V} \cdot \mathbf{t} \, ds = \int_{C^*} v_1 \, dx + v_2 \, dy + v_3 \, dz$$


taken for an oriented closed curve C* the circulation of the flow along
this curve. Stokes’s theorem states that the circulation along C* is
equal to the integral


$$\iint_{S} (\operatorname{curl} \mathbf{V}) \cdot \mathbf{n} \, dS,$$


where S is any orientable surface bounded by C, and n is the unit 
normal on S chosen in such a way that the screw determined by n 
and the sense of rotation of C* has the same sense (right-handed or 
left-handed) as that of the x, y, z-system. Suppose we divide the cir- 
culation around C by the area of the surface S bounded by C and pass 
to the limit by letting C shrink to a point while remaining on the 
surface. This process of space differentiation gives for the limit of the 
double integral of the normal component of curl V divided by the 
area the value of (curl V) · n at the limit point. We therefore see that
the component of curl V in the direction of the normal n to the surface
can be regarded as the specific circulation or circulation density of
the flow in the surface at the corresponding point.¹
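The limit process just described can be imitated numerically. The sketch below is an illustration under assumptions made here, not in the text: the velocity field V, the point p0, the normal n, and the use of small circles in the plane through p0 perpendicular to n are all hypothetical choices. It divides the circulation around such a circle by the enclosed area and compares the quotient with (curl V) · n at p0.

```python
import math

def V(x, y, z):
    # Illustrative velocity field; an assumption of this sketch.
    return (z - y**3, x**3 + x * z, y)

def curl_V(x, y, z):
    # Curl of the field above, computed by hand.
    return (1.0 - x, 1.0, 3.0 * x * x + 3.0 * y * y + z)

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def dot(p, q):
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2]

def unit(p):
    r = math.sqrt(dot(p, p))
    return (p[0] / r, p[1] / r, p[2] / r)

def circulation_density(p0, n, rho, steps=2000):
    """Circulation of V around a circle of radius rho about p0, divided by its area."""
    n = unit(n)
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = unit(cross(n, helper))   # unit vector perpendicular to n
    e2 = cross(n, e1)             # completes a right-handed frame: e1 x e2 = n
    dt = 2.0 * math.pi / steps
    circ = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        ct, st = math.cos(t), math.sin(t)
        point = tuple(p0[i] + rho * (ct * e1[i] + st * e2[i]) for i in range(3))
        tangent = tuple(rho * (-st * e1[i] + ct * e2[i]) for i in range(3))
        circ += dot(V(*point), tangent) * dt
    return circ / (math.pi * rho * rho)

p0 = (0.3, -0.2, 0.5)
n = (1.0, 2.0, 2.0)
exact = dot(curl_V(*p0), unit(n))
for rho in (0.5, 0.1, 0.001):
    print(rho, circulation_density(p0, n, rho), exact)
```

As the radius shrinks, the printed quotients approach the value of (curl V) · n at p0.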


¹These considerations also show that the curl of a vector has a meaning independent
of the coordinate system and therefore is itself a vector as long as the orientation of
the coordinate system (and, hence, the vector n) is not changed.

The vector curl V is called the vorticity of the motion of the fluid.
Thus, the circulation around a curve C is equal to the integral of the 
normal component of the vorticity over a surface bounded by C. The 
motion is called irrotational if the vorticity vector is 0 at every point 
occupied by the fluid, that is, if the velocity vector satisfies the 
relations 


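$$\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z} = \frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x} = \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y} = 0.$$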


As a consequence of Stokes’s theorem the circulation in an irrota- 
tional motion vanishes along any curve C that bounds a surface 
contained in the region filled by the fluid. 

If we interpret the vector V as the field of a mechanical or electrical 
force, the line integral 


$$\int_{C^*} \mathbf{V} \cdot \mathbf{t} \, ds$$


represents the work done by the field on a particle when it is made to 
describe the curve C* in the sense indicated by its orientation. By 
Stokes’s theorem the expression for this work is transformed into an 
integral over the surface S bounded by C, the integrand being the 
normal component of the curl of the field of force. If here the curl of 
the force field vanishes, the work done on a particle returning to the 
same point is zero, and the field is called conservative. 

From Stokes’s theorem we obtain a new proof for the main theorem 
on line integrals in space (p. 104). The chief problem is to describe 
the nature of the vector field A = (a, b, c) if the integral 


$$\int_{C} \mathbf{A} \cdot \mathbf{t} \, ds = \int_{C} a \, dx + b \, dy + c \, dz$$


is to vanish around an arbitrary closed curve C. Stokes’s theorem 
yields a new proof of the fact that the vanishing of the line integral 
is ensured if curl A = 0, provided C forms the boundary of a surface 
S contained in the region where A is defined. The vanishing of curl A 
—or, as we shall say, the irrotational nature of A—is therefore a 
sufficient condition for the vanishing of the line integral of the 
tangential component of A around any closed curve that bounds a 
surface S in the domain of definition of A. That the condition also 
is necessary we know already from p. 97. If the condition curl A = 0 
is satisfied, we can represent A as the gradient of a function f(x, y, z):

A = grad f.


If we take A as the velocity vector V of a fluid flow, irrotationality 
of the flow, that is, the equation curl V = 0, in a simply connected 
region implies that there exists a velocity potential f(x, y, z) such that 


V = grad f. 


If, in addition, the fluid is homogeneous and incompressible, we have 
(see p. 604) the relation 


div V = 0. 
It follows in this case that the velocity potential f satisfies the equation

$$0 = \operatorname{div} \operatorname{grad} f = \Delta f = f_{xx} + f_{yy} + f_{zz},$$


which is Laplace’s equation, already encountered earlier.
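As a small numerical illustration of Laplace’s equation for a velocity potential, one may approximate f_xx + f_yy + f_zz by central differences. The sketch below rests on assumptions made here: the potential f = 1/r (away from the origin), the comparison function g, the sample point and the step size h are illustrative choices and do not come from the text.

```python
import math

def f(x, y, z):
    # Hypothetical velocity potential f = 1/r, harmonic away from the origin.
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def g(x, y, z):
    # Comparison function that is not harmonic: its Laplacian is 6 everywhere.
    return x * x + y * y + z * z

def laplacian(func, x, y, z, h=1e-3):
    """Second-order central-difference approximation of f_xx + f_yy + f_zz."""
    center = func(x, y, z)
    total = func(x + h, y, z) - 2.0 * center + func(x - h, y, z)
    total += func(x, y + h, z) - 2.0 * center + func(x, y - h, z)
    total += func(x, y, z + h) - 2.0 * center + func(x, y, z - h)
    return total / (h * h)

print(laplacian(f, 1.0, 2.0, 2.0))  # close to 0
print(laplacian(g, 1.0, 2.0, 2.0))  # close to 6
```

The first value is zero up to discretization and rounding error, while the second shows what the same approximation gives for a function that does not satisfy Laplace’s equation.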


Exercises 5.10b 


1. Let φ, a, and b be continuously differentiable functions of a parameter
t, for 0 ≤ t ≤ 2π, with a(2π) = a(0), b(2π) = b(0), φ(2π) = φ(0) + 2nπ (n
a rational integer), and let x, y be constants. Interpreting the equations

$$\xi = x \cos\phi - y \sin\phi + a, \quad \eta = x \sin\phi + y \cos\phi + b$$

as the parametric equations (with parameter t) of a closed plane curve
Γ, prove that


$$\frac{1}{2} \int_{\Gamma} (\xi \, d\eta - \eta \, d\xi) = A (x^2 + y^2) + Bx + Cy + D,$$

where


$$A = \frac{1}{2} \int_{\Gamma} d\phi, \quad B = \int_{\Gamma} (a \cos\phi + b \sin\phi) \, d\phi,$$

$$C = \int_{\Gamma} (-a \sin\phi + b \cos\phi) \, d\phi, \quad D = \frac{1}{2} \int_{\Gamma} (a \, db - b \, da).$$

2. Let a rigid plane P describe a closed motion with respect to a fixed plane
Π with which it coincides. Every point M of P will describe a closed
curve of Π bounding an area of algebraic value S(M). Denote by 2nπ
(n a rational integer) the total rotation of P with respect to Π. Prove the
following results:

(a) If n ≠ 0, there is in P a point C such that for any other point M of
P we have


$$S(M) = \pi n \, \overline{CM}^2 + S(C);$$


(b) If n = 0, then two cases may arise: first there is in P an oriented
line Δ such that for every point M of P

S(M) = λ d(M),


where d(M) is the distance of M from Δ and λ is a constant positive
factor; or, second, S(M) has the same value for all the points M of
the plane P (Steiner’s theorem).

3. A rigid line segment AB describes in a plane Π one closed motion of a
connecting-rod: B describes a closed counterclockwise circular motion
with center C, while A describes a (closed) rectilinear motion on a line
passing through C. Apply the results of the previous example to determine
the area of the closed curve in Π described by a point M rigidly
connected to the line segment AB.

4. The end points A and B of a rigid line segment AB describe one full
turn on a closed convex curve Γ. A point M on AB, where AM = a,
MB = b, describes as a result of this motion a closed curve Γ′. Prove
that the area between the curves Γ and Γ′ is equal to πab (Holditch’s
theorem).

5. Prove that if we apply to each element ds of a twisted, closed, and rigid
curve Γ a force of magnitude ds/ρ in the direction of the principal normal
vector (Chapter 2, p. 213), the curve Γ remains in equilibrium; 1/ρ
is the curvature of Γ at ds and is supposed to be finite and continuous
at every point of Γ. (By the principles of the statics of a rigid body, we
have to prove that


$$\int_{\Gamma} \frac{\mathbf{n}}{\rho} \, ds = 0, \quad \int_{\Gamma} \frac{\mathbf{x} \times \mathbf{n}}{\rho} \, ds = 0,$$


where n denotes the unit principal normal vector of Γ at ds, and x is the
position vector of ds.)

6. Prove that a closed rigid surface Σ remains in equilibrium under a
uniform inward pressure on all its surface elements. (If by n′ we denote
the inward-drawn unit vector normal to the surface element dσ and by
x the position vector of dσ, the statement becomes equivalent to the
vector equations


$$\iint_{\Sigma} \mathbf{n}' \, d\sigma = 0, \quad \iint_{\Sigma} \mathbf{x} \times \mathbf{n}' \, d\sigma = 0.)$$


7. A rigid body of volume V bounded by the surface Σ is completely immersed
in a fluid of specific gravity unity. Prove that the statical effect
of the fluid pressure on the body is the same as that of a single force f
of magnitude V, vertically upward, applied at the centroid C of the
volume V.

8. Let p denote the distance from the center of the ellipsoid Σ


$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$


to the tangent plane at the point P(x, y, z) and dS the element of area
at this point. Prove the relations


$$\iint_{\Sigma} p \, dS = 4\pi abc, \tag{i}$$

$$\iint_{\Sigma} \frac{1}{p} \, dS = \frac{4\pi}{3abc} (b^2 c^2 + c^2 a^2 + a^2 b^2). \tag{ii}$$


9. An ordinary plane angle is measured by the length of the arc that its
sides intercept on a unit circle with center at the vertex. This idea can be
extended to a solid angle bounded by a conical surface with vertex A
as follows: The magnitude of the solid angle is by definition equal to the
area that it intercepts on a unit sphere with center A. Thus, the measure
of the solid angle of the domain x ≥ 0, y ≥ 0, z ≥ 0 is 4π/8 = π/2.
Now let Γ be a closed curve, Σ a surface bounded by Γ, and A a fixed
point outside both Γ and Σ. An element of area dS at a point M of Σ
defines an elementary cone with its vertex at A, and the solid angle of
this cone is readily found by an elementary argument to be


$$\frac{\cos\theta}{r^2} \, dS,$$


where r = AM and θ is the angle between the vector MA and the
normal to Σ at M. This elementary solid angle is positive or negative
according to whether θ is acute or obtuse. Interpret the surface integral


$$\Omega = \iint_{\Sigma} \frac{\cos\theta}{r^2} \, dS$$


geometrically as a solid angle and show that


$$\Omega = \iint_{\Sigma} \frac{(a - x) \, dy \, dz + (b - y) \, dz \, dx + (c - z) \, dx \, dy}{[(a - x)^2 + (b - y)^2 + (c - z)^2]^{3/2}},$$

where (a, b, c) and (x, y, z) are the Cartesian coordinates of A and M,
respectively.

10. Prove, first directly and then by interpretation of the integral as a solid
angle, that

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{dx \, dy}{(x^2 + y^2 + 1)^{3/2}} = 2\pi.$$


11. Prove that the solid angle that the whole surface of the hyperboloid of
one sheet (x²/a²) + (y²/b²) − (z²/c²) = 1 subtends at its center (0, 0, 0)
is

$$8c \int_{0}^{\pi/2} \frac{\sqrt{b^2 \cos^2\phi + a^2 \sin^2\phi}}{\sqrt{a^2 b^2 + b^2 c^2 \cos^2\phi + a^2 c^2 \sin^2\phi}} \, d\phi.$$


12. Show that the value of the integral


$$\Omega = \iint_{\Sigma} \frac{(a - x) \, dy \, dz + (b - y) \, dz \, dx + (c - z) \, dx \, dy}{[(a - x)^2 + (b - y)^2 + (c - z)^2]^{3/2}}$$


is independent of the choice of the surface Σ, provided its boundary Γ
is kept fixed. By integrating over the outside of the surface, prove from
this result that if Σ is a closed surface, then Ω = 4π or 0, according to
whether A(a, b, c) is within the volume bounded by Σ or outside this
volume.


13. Let the surface Σ be bounded by the closed curve Γ and consider the
integral

$$\Omega(a, b, c) = \iint_{\Sigma} \frac{(a - x) \, dy \, dz + (b - y) \, dz \, dx + (c - z) \, dx \, dy}{r^3}$$

[r² = (a − x)² + (b − y)² + (c − z)²],
as a function of a, b, c. Prove that the components of the gradient of
Ω can be expressed as line integrals as follows:


$$\frac{\partial \Omega}{\partial a} = \int_{\Gamma} \frac{(z - c) \, dy - (y - b) \, dz}{r^3}, \quad \frac{\partial \Omega}{\partial b} = \int_{\Gamma} \frac{(x - a) \, dz - (z - c) \, dx}{r^3},$$

$$\frac{\partial \Omega}{\partial c} = \int_{\Gamma} \frac{(y - b) \, dx - (x - a) \, dy}{r^3}.$$


These formulae, which have an important interpretation in electromagnetism,
can be expressed by the following vector equation


$$\operatorname{grad} \Omega = - \int_{\Gamma} \frac{\mathbf{x} \times d\mathbf{x}}{|\mathbf{x}|^3},$$


where x is the vector with components (x − a), (y − b), (z − c).

14. Verify that the expression

$$\frac{-4xy \, dx + 2(x^2 - y^2 - 1) \, dy}{(x^2 + y^2 - 1)^2 + 4y^2}$$

is the total differential of the angle that the segment −1 < x < 1, y = 0
subtends at the point (x, y). Using this fact, prove the following result
by a geometrical argument: Let Γ be an oriented closed curve in the x,
y-plane, not passing through either of the points (−1, 0), (1, 0). Let p be
the number of times Γ crosses the line segment −1 < x < 1, y = 0 from
the upper half-plane y > 0 to the lower half-plane y < 0, and n the
number of times Γ crosses this line segment from y < 0 to y > 0. Then,


$$\int_{\Gamma} \frac{-4xy \, dx + 2(x^2 - y^2 - 1) \, dy}{(x^2 + y^2 - 1)^2 + 4y^2} = 2\pi (p - n).$$

Thus, if Γ is the curve r = 2 cos 2θ (0 ≤ θ ≤ 2π), in polar coordinates,
the value of the integral is 0.

15. Consider the unit circle C

x = cos φ, y = sin φ, z = 0 (0 ≤ φ ≤ 2π)

in the x, y-plane. Denote by Ω the solid angle which the circular disc
x² + y² < 1, z = 0, subtends at the point P = (x, y, z). Now let P describe
an oriented closed curve Γ that does not meet the circle C. Let
p be the number of times Γ crosses the circular disc x² + y² < 1, z = 0,
from the upper half-space z > 0 to the lower half-space z < 0, and n
the number of times Γ crosses this disc from z < 0 to z > 0. If P starts
from a point P₀ on Γ with Ω = Ω₀, then P, describing Γ (while Ω varies
continuously with P), will return to P₀ with a value Ω = Ω₁. Prove by a
geometrical argument that


$$\Omega_1 - \Omega_0 = \int_{\Gamma} d\Omega = 4\pi (p - n).$$


Using the vector equation found above,

$$\operatorname{grad} \Omega = - \int_{C} \frac{\overrightarrow{PP'} \times dP'}{|PP'|^3}$$

(Exercise 13), prove that

$$\int_{\Gamma} \int_{C} \frac{\begin{vmatrix} x' - x & dx & dx' \\ y' - y & dy & dy' \\ z' - z & dz & dz' \end{vmatrix}}{[(x' - x)^2 + (y' - y)^2 + (z' - z)^2]^{3/2}}$$

$$= \int_{\Gamma} \int_{C} \frac{(x' - x)(dy \, dz' - dz \, dy') + (y' - y)(dz \, dx' - dx \, dz') + (z' - z)(dx \, dy' - dy \, dx')}{[(x' - x)^2 + (y' - y)^2 + (z' - z)^2]^{3/2}} = 4\pi (p - n).$$

[This repeated line integral, which is due to Gauss, gives the number of
times Γ is wound around C. It should be remarked that its vanishing is
necessary if the two curves Γ and C (thought of as being two strings)
are to be separable, but not sufficient, as is shown by the example in
Fig. 5.13, where p = n = 1, yet Γ and C cannot be separated.]


Figure 5.13


16. Let Γ be a closed curve in space on which a definite sense of description
of the curve has been assigned. Prove that there is a vector a with the
following characteristic property: for any unit vector n the scalar product
a · n is equal to the algebraic value of the area enclosed by the orthogonal
projection of Γ on the plane Π orthogonal to n. (Note that n
gives the orientation of Π, and Γ gives the orientation of its projection
on Π.) In particular, the projection of Γ on any plane parallel to a has
the algebraic area zero. (The vector a may be called the area vector of Γ.)

17. Let f(x, y) be a continuous function with continuous first and second
derivatives. Prove that if

$$f_{xx} f_{yy} - f_{xy}^2 \neq 0,$$

the transformation

$$u = f_x(x, y), \quad v = f_y(x, y), \quad w = -z + x f_x(x, y) + y f_y(x, y)$$

has a unique inverse, which is of the form

$$x = g_u(u, v), \quad y = g_v(u, v), \quad z = -w + u g_u(u, v) + v g_v(u, v).$$

18. Represent the gravitational vector field

$$X = \frac{x}{\sqrt{(x^2 + y^2 + z^2)^3}}, \quad Y = \frac{y}{\sqrt{(x^2 + y^2 + z^2)^3}}, \quad Z = \frac{z}{\sqrt{(x^2 + y^2 + z^2)^3}}$$


as a curl.


5.11 Integral Identities in Higher Dimensions 


The formulae of Gauss and Stokes discussed in the previous sec- 
tions all can be considered as extensions to more dimensions of the 
fundamental theorem of calculus 


$$\int_{a}^{b} f'(x) \, dx = f(b) - f(a). \tag{76}$$


That theorem expresses the integral of the derivative of a function of 
a single variable over an interval in terms of the values of the function 
at the boundary points of the interval. In a similar way, Gauss’s 
theorem 


$$\iiint_{R} (f_x + g_y + h_z) \, dx \, dy \, dz = \iint_{S} \left( f \frac{dx}{dn} + g \frac{dy}{dn} + h \frac{dz}{dn} \right) dS \tag{77}$$


(n = outward-drawn normal) expresses an integral over a set R in 
terms of quantities taken on the boundary of R. In vector form, with 
A = (f, g, h) the divergence theorem becomes 


$$\iiint_{R} \operatorname{div} \mathbf{A} \, dx \, dy \, dz = \iint_{S} \mathbf{A} \cdot \mathbf{n} \, dS.$$


Obviously, the expression div A plays the role of the derivative f’ 
in the simple formula (76). 
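The analogy with formula (76) can be made concrete on a simple region. The following sketch is an illustration under assumptions chosen here, not taken from the text: R is the unit cube, A = (f, g, h) is a polynomial field, its divergence is computed by hand, and both integrals are approximated by the midpoint rule.

```python
def A(x, y, z):
    # Illustrative field (f, g, h); an assumption of this sketch.
    return (x * x, x * y, z ** 3)

def div_A(x, y, z):
    # f_x + g_y + h_z = 2x + x + 3z^2 for the field above.
    return 3.0 * x + 3.0 * z * z

def volume_integral(n=40):
    """Midpoint rule for the integral of div A over the unit cube."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for x in mids:
        for y in mids:
            for z in mids:
                total += div_A(x, y, z)
    return total * h ** 3

def boundary_flux(n=40):
    """Midpoint rule for the outward flux of A through the six faces of the cube."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for s in mids:
        for t in mids:
            # Faces x = 1 and x = 0: outward normals (+1, 0, 0) and (-1, 0, 0).
            total += A(1.0, s, t)[0] - A(0.0, s, t)[0]
            # Faces y = 1 and y = 0.
            total += A(s, 1.0, t)[1] - A(s, 0.0, t)[1]
            # Faces z = 1 and z = 0.
            total += A(s, t, 1.0)[2] - A(s, t, 0.0)[2]
    return total * h * h

print(volume_integral(), boundary_flux())  # both close to 2.5
```

Just as f(b) − f(a) involves only the end points of the interval, the second function evaluates A only on the faces of the cube.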

In three dimensions we obtained in addition formulae expressing
integrals of differential expressions over curves or surfaces in terms
of boundary integrals. The curve integrals considered took the form


$$\int_{C} \mathbf{A} \cdot \mathbf{t} \, ds \tag{78}$$


(t = unit tangent vector of the curve C) and surface integrals the form


$$\iint_{S} \mathbf{A} \cdot \mathbf{n} \, dS$$

(n = unit normal vector to the surface S). There are bound to be
restrictions on the vector A if integrals of these types are to be ex- 
pressible in a form that only involves boundary points of C or of S. 
The reason is that there are many curves or surfaces in three-space 
with the same boundary. An identity expressing an integral in terms 
of functions on the boundary alone implies that the integral does not 
depend on the particular curve or surface chosen and this can only be 
the case for vectors A of special types. 

Thus, we found that if the line integral of A-t over a curve C is to 
depend only on the end points P and Q of C, then the vector field 
A(x, y, z) has to be irrotational; that is, curl A = 0. If this condition 
is satisfied in a simply connected set containing C, we can find a scalar 
U = U(x, y, z) such that A = grad U = (U_x, U_y, U_z); in that case,
we indeed have an integral identity of the desired type:


$$\int_{C} \mathbf{A} \cdot \mathbf{t} \, ds = \int_{C} dU = U(Q) - U(P).$$


Similarly, for the surface integral 


$$\iint_{S} \mathbf{A} \cdot \mathbf{n} \, dS$$


to depend only on the boundary curve C of S, the vector A has to
satisfy the necessary condition¹ div A = 0. If the condition div A =
0 is satisfied, we can represent A in the form A = curl B (see p. 315)
and express the integral of A · n over the surface S in terms of an
integral over C by Stokes’s theorem


$$\iint_{S} \mathbf{A} \cdot \mathbf{n} \, dS = \iint_{S} (\operatorname{curl} \mathbf{B}) \cdot \mathbf{n} \, dS = \int_{C} \mathbf{B} \cdot \mathbf{t} \, ds. \tag{79}$$


From these examples one would expect that there exist more gener- 
al formulae expressing appropriate combinations of derivatives of 
functions over an m-dimensional set in M-dimensional euclidean 
space as integrals of the functions over the (m — 1)-dimensional 


¹Assume that the double integral of A · n over any surface S depends only on the
boundary C of S. Then the integral is the same for any two surfaces with the same
boundary if we define the direction n consistently on the two surfaces (i.e., so that
the normal vectors n go into each other if one surface is deformed smoothly into the
other). In case the two surfaces together form the boundary σ of a set R in space, the
integral of A · N over σ is 0 if N denotes the unit normal of σ pointing away from
R. By the divergence theorem, it follows then that the integral of div A over R
vanishes. Since R is arbitrary, we find by space differentiation that div A = 0.

boundary of the set. For m = M Gauss’s theorem (77) suggests an
obvious generalization:


$$\int \cdots \int_{R} \left( f^1_{x_1} + f^2_{x_2} + \cdots + f^M_{x_M} \right) dx_1 \cdots dx_M
= \int \cdots \int_{S} \left( f^1 \frac{dx_1}{dn} + \cdots + f^M \frac{dx_M}{dn} \right) dS.$$


Here R is a set in M-space bounded by the (M − 1)-dimensional hypersurface
S with outward-drawn normal n, and f¹, f², . . . , f^M are
functions of x₁, . . . , x_M. On the other hand, the formula of Stokes
in the form (79) has no such obvious analogue. However, the calculus
of exterior, or alternating, differential forms leads one immediately
to conjecture the general Stokes’s formula


$$\int \cdots \int_{S^*} d\omega = \int \cdots \int_{\partial S^*} \omega \tag{80}$$


for arbitrary differential forms ω of order m − 1 and arbitrary m-dimensional
oriented surfaces S* with suitably oriented (m − 1)-dimensional
boundary ∂S*. In the Appendix to this chapter we shall
prove the general formula (80) without using any new ideas beyond 
those already arising in the rigorous proof of the special cases (77) 
and (79). 


Appendix: General Theory of Surfaces and of 
Surface Integrals 


Rigorous proofs of the theorems of Gauss and Stokes and their 
extensions to higher dimensions require a more careful analysis of 
the notions of surface, of orientation of surfaces, and of integrals 
over surfaces. These are provided in the present appendix. 


A.1 Surfaces and Surface Integrals in Three Dimensions 


a. Elementary Surfaces 


Elementary surfaces are essentially the analogues of the simple
arcs defined in Volume I, p. 334. They form the building blocks making
up surfaces of more complicated structure.

An elementary surface σ in x, y, z-space is a set of points P =
(x, y, z) represented parametrically by three functions,


$$x = f(u, v), \quad y = g(u, v), \quad z = h(u, v), \tag{1a}$$


where (1) the domain U of the functions is an open bounded set in the
u, v-plane; (2) f, g, h are continuous and have continuous first derivatives
in U; (3) the inequality


$$W = \sqrt{ \begin{vmatrix} f_u & f_v \\ g_u & g_v \end{vmatrix}^2 + \begin{vmatrix} g_u & g_v \\ h_u & h_v \end{vmatrix}^2 + \begin{vmatrix} h_u & h_v \\ f_u & f_v \end{vmatrix}^2 }
= \sqrt{(f_u g_v - f_v g_u)^2 + (g_u h_v - g_v h_u)^2 + (h_u f_v - h_v f_u)^2} > 0 \tag{1b}$$


is satisfied at all points of U; and (4) the mapping of the set U in the
u, v-plane on the set σ in x, y, z-space is 1-1 and the inverse mapping
from σ onto U is also continuous.

The quantity W represents the length of the vector with components


$$A = g_u h_v - g_v h_u, \quad B = h_u f_v - h_v f_u, \quad C = f_u g_v - f_v g_u, \tag{2}$$


that is, the vector product of the two vectors


$$(f_u, g_u, h_u) \quad \text{and} \quad (f_v, g_v, h_v). \tag{3}$$


The two vectors in (3) are tangential to the surface, while the vector
(A, B, C) is perpendicular to those two and, hence, normal to the
surface. Inequality (1b) guarantees that there are only two directions
normal to the surface, namely that of the vector (A, B, C) and of its
opposite (−A, −B, −C).

At each point of σ, at least one of the three quantities A, B, C does
not vanish. If, say, C ≠ 0 at a point P₀ = (x₀, y₀, z₀) corresponding to
a parameter point (u₀, v₀) in U, we can find for a sufficiently small
positive ε a number δ > 0 such that each pair (x, y) with


$$\sqrt{(x - x_0)^2 + (y - y_0)^2} < \delta \tag{4}$$

is representable uniquely in the form

$$x = f(u, v), \quad y = g(u, v) \tag{5}$$

with


$$\sqrt{(u - u_0)^2 + (v - v_0)^2} < \varepsilon. \tag{6}$$

The values u, v determined by x, y are functions


$$u = \phi(x, y), \quad v = \psi(x, y), \tag{7}$$


which are continuous and have continuous first derivatives for (x, y)
satisfying (4). By the assumed continuous dependence of (u, v) on P
we see that every point P on the surface σ that is sufficiently close to
P₀ has parameters (u, v) satisfying (6). If, moreover, the distance from
P to P₀ is < δ, the coordinates x, y of P will satisfy (4). Thus, for all
P on σ sufficiently close to P₀, we can express the parameter values
u, v in terms of x, y by (7). On substituting these values in the equation
z = h(u, v), we then have a nonparametric representation


$$z = h(\phi(x, y), \psi(x, y)) = H(x, y), \tag{8}$$


which applies to all points of the surface σ that are sufficiently close
to P₀. If the quantity B does not vanish, we obtain similarly a local
representation of the form y = G(x, z) and in case A ≠ 0 a representation
of the form x = F(y, z).

The same elementary surface σ has many different parameter
representations, all of which, however, are related in a simple fashion.
Let


$$x = \bar{f}(\bar{u}, \bar{v}), \quad y = \bar{g}(\bar{u}, \bar{v}), \quad z = \bar{h}(\bar{u}, \bar{v}) \quad \text{for } (\bar{u}, \bar{v}) \text{ in } \bar{U} \tag{9}$$


be a second parameter representation for σ also satisfying all our
four requirements. The bi-unique and bi-continuous correspondence
between U and σ and between Ū and σ establishes then a 1-1 and
continuous mapping with continuous inverse of the set Ū onto the
set U:


$$u = \alpha(\bar{u}, \bar{v}), \quad v = \beta(\bar{u}, \bar{v}) \quad \text{for } (\bar{u}, \bar{v}) \text{ in } \bar{U}. \tag{10}$$


If, here, for a certain (ū₀, v̄₀) in Ū the corresponding values (u₀, v₀)
are such that the quantity C(u₀, v₀) is not zero, then the representation
(7) applies for all (u, v) near (u₀, v₀), and hence, we find from (9)
that


$$\begin{aligned}
u &= \alpha(\bar{u}, \bar{v}) = \phi(\bar{f}(\bar{u}, \bar{v}), \bar{g}(\bar{u}, \bar{v})), \\
v &= \beta(\bar{u}, \bar{v}) = \psi(\bar{f}(\bar{u}, \bar{v}), \bar{g}(\bar{u}, \bar{v}))
\end{aligned}$$


for all (ū, v̄) sufficiently close to (ū₀, v̄₀). Since φ, ψ, f̄, ḡ all are functions
with continuous first derivatives, it follows that the functions
α, β describing the change of parameters (10) not only are continuous
but have continuous first derivatives as well.
Putting 


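$$\Delta = \frac{d(u, v)}{d(\bar{u}, \bar{v})} = \frac{\partial u}{\partial \bar{u}} \frac{\partial v}{\partial \bar{v}} - \frac{\partial u}{\partial \bar{v}} \frac{\partial v}{\partial \bar{u}}, \tag{11}$$


we find from the rules for the Jacobian of the product of two mappings
[see (31b), p. 258] that


$$\bar{C} = \frac{d(x, y)}{d(\bar{u}, \bar{v})} = \frac{d(x, y)}{d(u, v)} \cdot \frac{d(u, v)}{d(\bar{u}, \bar{v})} = C \Delta \tag{12a}$$


and, similarly, that

$$\bar{B} = B \Delta, \quad \bar{A} = A \Delta. \tag{12b}$$


In particular, we find that the Jacobian of the mapping (10) between
the two parameter regions does not vanish, since by (12a, b)


$$\bar{W} = \sqrt{\bar{A}^2 + \bar{B}^2 + \bar{C}^2} = \sqrt{\Delta^2 (A^2 + B^2 + C^2)} = |\Delta| W \tag{13}$$

and, by assumption, W̄ ≠ 0.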


Of course the same statements are valid for the expressions of ū, v̄
in terms of u, v. The important fact is that the relation between two
parameter systems for the same elementary surface satisfies all of the
assumptions made in the proofs of the transformation laws for areas
and integrals.


b. Integral of a Function over an Elementary Surface


There is nothing difficult in the notion of a continuous function F
defined in the points P of an elementary surface σ. We just require
that with every P ∈ σ there is associated a value F = F(P) in such
a way that for a sequence of points Pₙ on σ that converges to a
point P of σ, we have


$$\lim_{n \to \infty} F(P_n) = F(P).$$

In any particular parametric representation (1a), F becomes a function
of u, v in the domain U and continuity of F on σ becomes equivalent¹
to continuity of F as a function of u and v.


¹We make use here of the bi-continuous character of the relation between σ and U.

We restrict ourselves here to continuous functions F on σ that
are zero outside some compact (i.e., closed and bounded) subset s
of σ. The corresponding parameter points (u, v) form then a compact¹
subset S of U. We then define the integral of F over the elementary
surface σ by the formula


$$\iint_{\sigma} F \, dA = \iint_{U} F W \, du \, dv, \tag{14}$$


where W is the expression given by (1b). Here FW is a continuous
function of u, v, which we define as 0 for (u, v) outside S; hence, FW
is integrable. One still has to show that the surface integral of F
over σ defined by (14) does not depend on the particular parameter
representation (1a). This follows immediately from the law of transformation
(13) for W and from the general formula (16b), p. 403, for
transformation of double integrals under a change of variables from
u, v to ū, v̄. Indeed,


$$\iint F W \, du \, dv = \iint F W \left| \frac{d(u, v)}{d(\bar{u}, \bar{v})} \right| d\bar{u} \, d\bar{v}
= \iint F W |\Delta| \, d\bar{u} \, d\bar{v} = \iint F \bar{W} \, d\bar{u} \, d\bar{v}.$$


The independence of the integral of FW from the particular parametric
representation means that the differential form W du dv = dA
is invariant; it can be identified with the element of area.
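The invariance of the integral (14) and the transformation law (13) can be observed numerically. The sketch below makes several assumptions of its own: the surface z = u² + v² over the open unit square, the change of parameters u = (ū + ū²)/2, v = v̄ (with Jacobian Δ = (1 + 2ū)/2 > 0), the function F, and the approximation of the first derivatives by central differences are illustrative choices, not taken from the text.

```python
import math

def surface(u, v):
    # Parametrization (1a) of an illustrative surface: x = u, y = v, z = u^2 + v^2.
    return (u, v, u * u + v * v)

def reparam(ub, vb):
    # Hypothetical change of parameters (10); maps the unit square onto itself.
    return ((ub + ub * ub) / 2.0, vb)

def surface_bar(ub, vb):
    # The same surface expressed in the second parameter system.
    return surface(*reparam(ub, vb))

def W(X, u, v, h=1e-5):
    """Length of the vector (A, B, C) of formula (2), derivatives by central differences."""
    Xu = [(p - q) / (2.0 * h) for p, q in zip(X(u + h, v), X(u - h, v))]
    Xv = [(p - q) / (2.0 * h) for p, q in zip(X(u, v + h), X(u, v - h))]
    A = Xu[1] * Xv[2] - Xv[1] * Xu[2]
    B = Xu[2] * Xv[0] - Xv[2] * Xu[0]
    C = Xu[0] * Xv[1] - Xv[0] * Xu[1]
    return math.sqrt(A * A + B * B + C * C)

def F(x, y, z):
    # Continuous function on the surface that vanishes outside a compact subset.
    bump = 0.16 - (x - 0.5) ** 2 - (y - 0.5) ** 2
    return bump * bump if bump > 0.0 else 0.0

def integrate(F, X, n=200):
    """Midpoint rule for the double integral of F W over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) * h, (j + 0.5) * h
            total += F(*X(u, v)) * W(X, u, v) * h * h
    return total

first = integrate(F, surface)
second = integrate(F, surface_bar)
print(first, second)  # the two approximations should nearly agree
```

At a single parameter point the law (13) can be checked in the same way: with ū = 0.3 the Jacobian is Δ = 0.8, and `W(surface_bar, 0.3, 0.6)` agrees closely with `0.8 * W(surface, *reparam(0.3, 0.6))`.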

It would be easy to extend the notion of integral over an elementary
surface to more general functions, although we will not do so in
the sequel. This involves the extension of the notion of Jordan-measurability
to a set s whose closure is contained in the elementary
surface σ; we merely require that the corresponding set S of points
(u, v) in the parameter plane be a Jordan-measurable set whose closure
lies in U. It is seen immediately from the relations between different
parameter representations that Jordan-measurability of s does not
depend on the particular representation.² The same holds for the area
of s that we can define as


¹For (uₙ, vₙ) ∈ S and (uₙ, vₙ) → (u, v) the corresponding points Pₙ of σ lie in s. Compactness
of s implies that a subsequence of the Pₙ converges toward a point P of s. By
continuity, convergence of Pₙ to P implies convergence of the (uₙ, vₙ) to the corresponding
parameter point in S. Thus, (u, v) ∈ S, which proves that S is closed. It
is bounded as a subset of the bounded set U.

²See p. 539.

$$A(s) = \iint_{s} dA = \iint_{S} W \, du \, dv.$$


Of particular importance are the sets s whose closure lies on σ
and that have area 0. They correspond to sets S in the u, v-plane of
area 0; this means that S can be covered by a finite number of squares
contained in U of arbitrarily small total area.


c. Oriented Elementary Surfaces 


A particular parameter representation (1a) of the elementary surface
σ is said to define a particular orientation of σ (the one that is
positive with respect to the u, v-system). Two parameter sets u, v
and ū, v̄ for the same elementary surface σ are said to give σ the
same orientation if the Jacobian


$$\frac{d(\bar{u}, \bar{v})}{d(u, v)}$$


is positive throughout the parameter domains and to give the opposite
orientations if the Jacobian is negative throughout the parameter
domains. The combination of the elementary surface σ with
a particular orientation is called an oriented elementary surface σ*.

By our assumptions, the Jacobian cannot vanish. Since it is also
a continuous function of the parameters, we can be sure that it has
constant sign when the parameter domain is a connected set. In that
case there are only two possible orientations for an elementary surface
σ that may be distinguished as σ* and −σ*. It is clear, however,
that the number of possible orientations is larger for disconnected
sets, where orientations of the parts of σ corresponding to the different
components of U can be changed independently of each other.

Orientation of the elementary surface is intimately connected with 
picking a normal direction on o or with “distinguishing the sides” 
of o. A particular parameter representation (1a) of o defines by
formulae (2) at each point P quantities A, B, C that can be considered
as the components of a vector perpendicular to o at P. This vector
has the same direction as the unit vector with components


\[ \xi = \frac{A}{W}, \qquad \eta = \frac{B}{W}, \qquad \zeta = \frac{C}{W}. \tag{15} \]

When we change parameters from u, v to ū, v̄ the quantities A, B, C
change and are replaced by the proportional quantities Ā, B̄, C̄,
according to the laws (11) and (12a). Here the factor of proportionality
is just the Jacobian


\[ \frac{d(u, v)}{d(\bar u, \bar v)}. \]


Hence, the unit normal (ξ, η, ζ) is the same for equal orientations of o
and opposite for opposite orientations. Equivalently, the orientation
of o* picks out at each point a certain side of o, namely, that one
to which the normal (ξ, η, ζ) points.1

The orientation of o* can also assign a definite sense to every 
simple closed curve C lying on o by ascribing to C that sense that 
is positive on the closed curve γ in the u, v-plane that corresponds
to C with respect to the finite region enclosed by γ.

1This is the positive side of o*, which depends on the orientation of the x, y, z-coordinate
system; see p. 580. In the notation used on p. 581, we have Ω(o*) = Ω(u, v).

Specification of an orientation for the elementary surface becomes
mandatory when we consider instead of integrals of the form ∬ F dA,
where F is a scalar, an integral of a differential form


\[ \omega = a \, dy \, dz + b \, dz \, dx + c \, dx \, dy, \tag{16} \]


where, say, a, b, c are continuous functions on o vanishing outside
a closed and bounded subset. Here the natural interpretation for
the integral suggested by the substitution formulae is, of course,


\[
\begin{aligned}
\iint \omega &= \iint \left( a \frac{d(y, z)}{d(u, v)} + b \frac{d(z, x)}{d(u, v)} + c \frac{d(x, y)}{d(u, v)} \right) du \, dv \\
&= \iint (aA + bB + cC) \, du \, dv \\
&= \iint (a\xi + b\eta + c\zeta) \, W \, du \, dv = \iint (a\xi + b\eta + c\zeta) \, dA,
\end{aligned}
\]


where we have made use of the relations (15) and (14). Here ξ, η, ζ
are the direction cosines of the normal determined by the choice of
the parameters u, v; their sign depends on the orientation of our
surface o. Thus, we first define the integral of ω over one of the
oriented surfaces o* arising from o. We put


\[ \iint_{o^*} \omega = \iint \left( a \frac{d(y, z)}{d(u, v)} + b \frac{d(z, x)}{d(u, v)} + c \frac{d(x, y)}{d(u, v)} \right) du \, dv = \iint (a\xi + b\eta + c\zeta) \, dA, \tag{17} \]


where u, v must be one of the parameter systems used to define the 
orientation of o* or connected with such a system by a substitution 
with positive Jacobian and where ξ, η, ζ is the normal direction
induced by the orientation of o*. If −o* is the elementary surface
with the opposite orientation, we have


\[ \iint_{-o^*} \omega = -\iint_{o^*} \omega. \tag{18} \]
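
As a numerical illustration of (17) and (18), the following sketch approximates the integral of ω over a graph surface by the midpoint rule and then repeats the computation with the roles of u and v exchanged, a substitution with Jacobian −1. The surface z = xy over the unit square, the coefficients a = x, b = y, c = z, and the helper name form_integral are assumptions chosen only for this example; A, B, C are taken to be the components of X_u × X_v, in agreement with the Jacobians appearing in (17).

```python
import numpy as np

def form_integral(X, a, b, c, n=200, swap=False):
    """Midpoint-rule value of the integral of a dy dz + b dz dx + c dx dy
    over the surface X(u, v), with (u, v) in the unit square.
    swap=True exchanges the roles of u and v (Jacobian -1)."""
    h = 1.0 / n
    mid = (np.arange(n) + 0.5) * h
    u, v = np.meshgrid(mid, mid, indexing="ij")
    d = 1e-6  # step for central differences
    Xu = (X(u + d, v) - X(u - d, v)) / (2 * d)
    Xv = (X(u, v + d) - X(u, v - d)) / (2 * d)
    if swap:
        Xu, Xv = Xv, Xu
    # (A, B, C) = X_u x X_v, i.e. the Jacobians d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)
    A, B, C = np.cross(Xu, Xv, axis=0)
    x, y, z = X(u, v)
    return np.sum(a(x, y, z) * A + b(x, y, z) * B + c(x, y, z) * C) * h * h

# Illustrative surface z = x y and form x dy dz + y dz dx + z dx dy (assumptions).
X = lambda u, v: np.array([u, v, u * v])
a = lambda x, y, z: x
b = lambda x, y, z: y
c = lambda x, y, z: z

I_plus = form_integral(X, a, b, c)
I_minus = form_integral(X, a, b, c, swap=True)
```

For this choice A = −v, B = −u, C = 1, so the integrand reduces to −uv and the two values are close to −1/4 and +1/4, differing only in sign as (18) requires.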


d. Simple Surfaces 


Let o be an elementary surface with a parametric representation
(1a) where the parameter point (u, v) varies over the open set U. If
U’ is any open subset of U, the points of o with (u, v) restricted to
U’ clearly form an elementary surface o’ contained in o. Indeed, all
four of our conditions immediately apply to o’, using the same
parameters u, v. As an example, we note that the points of o of distance
< ε from a given point (x0, y0, z0) again form an elementary surface
(if not empty), for those are the points whose parameter values u, v
satisfy


\[ [f(u, v) - x_0]^2 + [g(u, v) - y_0]^2 + [h(u, v) - z_0]^2 < \varepsilon^2, \tag{19} \]


and since f, g, h are continuous functions in U, the set U’ of such
points (u, v) is open.

It is less obvious that the most general elementary surface o’ con- 
tained in the elementary surface o can be obtained by restricting the 
parameter domain of o to a suitable open set. 

For the proof, let the elementary surface o have the parametric
representation (1a) for (u, v) ∈ U. Let o’ be an elementary surface with
the parametric representation (9) with (ū, v̄) varying over the set Ū.
Let o’ be a subset of o. Then every (ū, v̄) ∈ Ū determines a point P ∈ o,
which in turn determines a point (u, v) ∈ U whose coordinates are
functions of ū, v̄:


\[ u = \alpha(\bar u, \bar v), \qquad v = \beta(\bar u, \bar v) \qquad \text{for } (\bar u, \bar v) \in \bar U. \tag{20} \]


The set Ū is mapped by (20) onto a subset U’ of U. It is clear then
that the set o’ arises from o by restricting the parameter points (u, v)
to the subset U’ of U. It only remains to see that U’ is open. Let P0 =
(x0, y0, z0) be a point of o’ corresponding, respectively, to the parameter
points (ū0, v̄0) in Ū and (u0, v0) in U’. Let C and C̄ both be different
from 0 at that point.1 Then a neighborhood of (ū0, v̄0) is mapped by


\[ x = \bar f(\bar u, \bar v), \qquad y = \bar g(\bar u, \bar v) \]


onto a set in the x, y-plane that covers a neighborhood of (x0, y0); the
corresponding points (u, v) obtained from (7) then cover a neighborhood
of (u0, v0), so that U’ is seen to be an open set.

We see in addition that the two surfaces o and o’ agree in a
sufficiently small neighborhood of P0, since every P on o sufficiently
near P0 has parameter values (u, v) arbitrarily near (u0, v0); thus, for
P sufficiently close to P0, we have (u, v) ∈ U’, since (u0, v0) is an
interior point of U’, and hence, we see that P ∈ o’. We have proved:


If the elementary surface o’ is contained in the elementary surface
o and if P0 is a point of o’, then we can find a sufficiently small
neighborhood of P0 in which o and o’ agree.

Any orientation imposed on the elementary surface o immediately 
determines a unique orientation on any elementary surface o’ con- 
tained in o. We need only refer o’ to the same parameter system that 
defines the orientation of o and take that system to fix the orientation 
of o’. 

We are now in a position to give precise meaning to the more 
general notion of a simple surface, as an object “patched together” 
from elementary surfaces: 

A set t in x, y, z-space is called a simple surface if for every point
P0 on t there exists an ε > 0 such that the points of t that have
distance less than ε from P0 form an elementary surface.

Thus, for every P0 ∈ t there is an elementary surface o that agrees
with t near P0 and is contained in t. We can show that the
intersection of two elementary surfaces o’ and o” contained in the simple
surface t is again an elementary surface (if not empty), for if P0 is
a common point of o’ and o”, we can find an ε-neighborhood N_ε of P0
such that o = N_ε ∩ t is an elementary surface. Here o contains the
two elementary surfaces N_ε ∩ o’ and N_ε ∩ o”. Consequently, o’ and
o” agree with o, and thus with each other, at all points sufficiently
near to P0. If o’ is referred to parameters u, v with u0, v0 corresponding
to P0, all (u, v) sufficiently close to (u0, v0) will correspond to points
of o’ that lie in o”. Hence, the parameter points (u, v) corresponding
to points (x, y, z) in o’ ∩ o” form an open set. Thus, o’ ∩ o” is an
elementary surface.


1We can assume that all three quantities Ā, B̄, C̄ are ≠ 0 at P0, applying, if necessary,
a suitable rotation to x, y, z-space. At least one of the quantities A, B, C does not
vanish at P0; let it be C.

We define an oriented simple surface analogously: 


The simple surface t is oriented if t is represented as the union 
of elementary surfaces each of which has been given an orientation, 
provided the orientations agree in the intersection of any two of the 
elementary surfaces. Two orientations of t are considered identical 
if they lead to the same orientations at the points common to any two of 
the oriented elementary surfaces used in defining the orientations of 
t. Equivalently, two orientations are identical if they lead to the same 
choice of a normal direction at each point of t. 

A case of special importance arises when the simple surface t
is the boundary of a set R in x, y, z-space. We assume here that R
is the closure of a bounded open set.1 In that case, we can assign an
orientation to t for which the positive sense assigned by the
orientation to each normal of t is that of the “direction pointing away from
R” or that of the “exterior normal.” Indeed, for each point P0 =
(x0, y0, z0) on t, we can find a neighborhood in which t agrees with an
elementary surface. We can even choose the neighborhood so small
that t can be represented nonparametrically in that neighborhood,
say, by an equation


\[ z = F(x, y) \qquad \text{valid for } (x - x_0)^2 + (y - y_0)^2 < \varepsilon^2. \tag{21} \]


1This means that R is closed and bounded and that every boundary point of R is
the limit of interior points.

If two points P and P’ in space can be joined by an arc that contains
no point of the boundary t of R, either both or neither lie in R. This
is clearly the case for any two points satisfying either condition


\[ F(x, y) < z < F(x, y) + \delta, \qquad (x - x_0)^2 + (y - y_0)^2 < \varepsilon^2 \tag{22a} \]

or

\[ F(x, y) - \delta < z < F(x, y), \qquad (x - x_0)^2 + (y - y_0)^2 < \varepsilon^2, \tag{22b} \]


provided δ is a sufficiently small positive number. Thus, each of the
two sets (22a) and (22b) either is completely contained in R or has
no points in common with R. They cannot both be contained in R,
for then the set (21) also would belong to R, since R is closed; but then
P0 would not be a boundary point of R. Neither can both sets be free
of points of R, since then P0 could not be a limit of interior points of
R. Thus, exactly one of the sets (22a) and (22b) is contained in R. If
(22b) is the set contained in R, we choose the parameters u = x, v = y
to assign an orientation to the elementary surface (21), writing


\[ x = u, \qquad y = v, \qquad z = F(u, v). \]


The corresponding normal direction has direction cosines [see (2) and
(15)]


\[ \xi = -\frac{F_x}{W}, \qquad \eta = -\frac{F_y}{W}, \qquad \zeta = \frac{1}{W}. \]


Since ζ > 0, the normal at any point of the surface points away from
R, in the sense that any point on the normal at a point of (21) that is
sufficiently close to the surface will lie in the set (22a) and, hence,
outside R. Similarly, if the set (22a) belongs to R, we define the
orientation of (21) by the parametric representation


\[ x = v, \qquad y = u, \qquad z = F(v, u), \]


which leads to ζ = −1/W < 0 and again singles out the normal
direction away from R.

We have thus represented t as a union of oriented simple surfaces, 
where, because of the geometric meaning of the orientation in re- 
lation to the set R, orientations agree in overlapping simple surfaces. 
We call t oriented positively with respect to R.1


e. Partitions of Unity and Integrals over Simple Surfaces 


Given a simple surface t, we wish to define 


\[ \iint_t F \, dA \]


under the assumption that F is a continuous function on t that 
vanishes outside some closed and bounded subset s of t. (In case 
the whole surface t is closed and bounded, the definition will furnish 
the integral over t of an arbitrary continuous function on t.) We 
make use of a device known as partition of unity to reduce our in- 
tegrals to integrals over compact subsets of elementary surfaces that 
have been defined already. 


1We assume here that R has the orientation of the x, y, z-coordinate system.

A partition of unity consists of a finite number of functions χ_1(P),
χ_2(P), . . ., χ_N(P) defined and continuous in the points P of the set
s with the properties:


1. χ_i(P) ≥ 0 for all P ∈ s and i = 1, . . ., N;
2. χ_1(P) + χ_2(P) + · · · + χ_N(P) = 1 for all P ∈ s;
3. for each i = 1, . . ., N there exists an elementary surface o_i
contained in t such that χ_i(P) = 0 for P in s outside a certain compact
subset of o_i.


(It is, of course, property 2 that accounts for the name partition of 
unity). 

Assume that we have such a partition of unity for s. We can write
for P ∈ s


\[ F(P) = F(P) \chi_1(P) + F(P) \chi_2(P) + \cdots + F(P) \chi_N(P). \tag{23a} \]


Here each term is defined and continuous for P in s. However, since
F(P) is assumed to be defined and continuous on the whole of t and
to vanish outside the set s, we can extend each term F(P) χ_i(P) over
the whole of t as a continuous function just by defining F χ_i as
zero for points of t not in s.

We then define the integral of F over t by the formula 


\[ \iint_t F \, dA = \sum_{i=1}^{N} \iint_{o_i} F \chi_i \, dA. \tag{23b} \]


Here the integrals on the right have a meaning since F χ_i is
continuous on the elementary surface o_i and vanishes outside a
compact subset of o_i.

To complete the definition, we have to show that the expression
(23b) for the integral of F over t does not depend on the particular
partition of unity used. Assume that we have a second partition
consisting of functions χ_1’(P), χ_2’(P), . . ., χ_M’(P) vanishing,
respectively, outside compact subsets of elementary surfaces o_1’, . . ., o_M’.
For each i = 1, . . ., N and k = 1, . . ., M the set

o_i ∩ o_k’


is again an elementary surface (if not empty), since both o_i and o_k’
lie on t. Moreover, the function F χ_i χ_k’ vanishes outside a compact
subset of that surface. Hence, formula (23b) yields


\[
\begin{aligned}
\iint_t F \, dA &= \sum_i \iint_{o_i} F \chi_i \, dA \\
&= \sum_i \sum_k \iint_{o_i} F \chi_i \chi_k' \, dA \\
&= \sum_{i,k} \iint_{o_i \cap o_k'} F \chi_i \chi_k' \, dA \\
&= \sum_k \sum_i \iint_{o_k'} F \chi_i \chi_k' \, dA \\
&= \sum_k \iint_{o_k'} F \chi_k' \, dA,
\end{aligned}
\]


which shows that a different partition leads to the same value for the
integral.

It remains to exhibit an actual partition of unity. By definition,
we have for every point Q of the simple surface t a number ε_Q > 0
such that the points of t within distance ε_Q from Q form an elementary
surface o_Q. We associate with Q the function of P defined by


(24a) ψ_Q(P) =


Here PQ denotes the distance between the two points P and Q. The
function ψ_Q(P) is defined and continuous for all P in space and,
hence, in particular, is continuous on o_Q. The number ε_Q can be
chosen so small that the set of points P on o_Q for which PQ ≤
½ε_Q is closed.1 These points then form a compact subset of o_Q outside
of which the function ψ_Q(P) vanishes.


1The reason is that all points P in the closure of an elementary surface o that are
sufficiently near to a given point Q of o have to belong to the set o itself: Let o
correspond to the open set U in the parameter plane, with Q corresponding to a point
q. Let P_n be a sequence of points on o with images p_n in U, and let P_n → P. For P_n
sufficiently close to Q the p_n lie in a closed disc about q contained in U. A
subsequence of the p_n converges to a point p of U. The point on o corresponding to p is
just P. Now by definition of t there exists a positive δ_Q such that the points P of t
with PQ < δ_Q form an elementary surface o. There exists then a positive ε_Q ≤ δ_Q
(depending on the choice of δ_Q) such that the points P of the closure of o for which
PQ ≤ ½ε_Q belong to o. Let o_Q ⊂ o denote the set of points P of t with PQ < ε_Q.
Then the closure of the set of points P of o_Q with PQ ≤ ½ε_Q belongs to o, and
hence also to o_Q since ½ε_Q < ε_Q.

We take now for each Q on t the open ball of radius ½ε_Q in which
the function ψ_Q is positive. By the Heine-Borel theorem a finite
number of these balls, say the ones with centers Q_1, . . ., Q_N, already
covers the closed and bounded set s. We then define the partition
functions χ_i for i = 1, . . ., N by


\[ \chi_i(P) = \frac{\psi_{Q_i}(P)}{\psi_{Q_1}(P) + \cdots + \psi_{Q_N}(P)}. \tag{24b} \]

Here the denominator is different from zero for each P in s, so that
χ_i(P) is defined and continuous in s. It is clear that in s the χ_i(P)
are nonnegative and have sum 1. Moreover, χ_i(P) = 0 outside a
compact subset of the elementary surface o_{Q_i}. Thus, the χ_i(P) form
a partition of unity.
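
The construction (24b) can be tried out numerically. The sketch below takes the unit sphere as the set s, six centers Q_i on the coordinate axes, and a common radius parameter ε = 1.9; all of these are assumptions made for the illustration. The bump ψ_Q(P) = max(0, ε − 2·PQ) is likewise only one convenient continuous function that is positive exactly for PQ < ½ε; it is not claimed to be the function of (24a).

```python
import numpy as np

def psi(P, Q, eps):
    """Continuous bump: positive for |PQ| < eps/2, zero otherwise (illustrative choice)."""
    dist = np.linalg.norm(P - Q, axis=-1)
    return np.maximum(0.0, eps - 2.0 * dist)

def partition_of_unity(P, centers, eps):
    """Rows chi[i] = psi_{Q_i} / (psi_{Q_1} + ... + psi_{Q_N}), following (24b)."""
    bumps = np.array([psi(P, Q, eps) for Q in centers])
    total = bumps.sum(axis=0)
    if np.any(total <= 0.0):
        raise ValueError("the balls of radius eps/2 do not cover all sample points")
    return bumps / total

# Six centers on the unit sphere; the balls of radius eps/2 = 0.95 cover the sphere.
centers = np.vstack([np.eye(3), -np.eye(3)])
eps = 1.9

# Random sample points P on the unit sphere.
rng = np.random.default_rng(0)
P = rng.normal(size=(1000, 3))
P /= np.linalg.norm(P, axis=1, keepdims=True)

chi = partition_of_unity(P, centers, eps)
```

Each row of chi is nonnegative, the rows sum to 1 at every sample point, and chi[i] vanishes wherever the distance from Q_i is at least ½ε, which are the three properties required of a partition of unity.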

Having defined the integral of a function F over a simple surface, 
we can immediately obtain the integral of a differential form 


\[ \omega = a \, dy \, dz + b \, dz \, dx + c \, dx \, dy \tag{25a} \]


over an oriented simple surface t*, assuming the coefficients a, b, c
to vanish outside a compact subset s of t*. We simply take


\[ \iint_{t^*} \omega = \iint_t (a\xi + b\eta + c\zeta) \, dA, \tag{25b} \]


where t is the unoriented surface and ξ, η, ζ are the direction cosines
of the normal singled out by the orientation of t* with respect to the
coordinate axes.


A.2 The Divergence Theorem 


a. Statement of the Theorem and Its Invariance 


In several variables the role of the fundamental theorem of cal- 
culus, which connects the operations of differentiation and inte- 
gration, is played by the Gauss divergence theorem. Under suitable 
assumptions, for a set R in x, y, z-space with boundary surface t the 
theorem takes the form 


\[ \iiint_R (a_x + b_y + c_z) \, dx \, dy \, dz = \iint_t (a\xi + b\eta + c\zeta) \, dA, \tag{26} \]


where ξ, η, ζ denote the direction cosines of the exterior normal (i.e.,
of the normal pointing away from R) in the points of t.

We shall prove the theorem here under the assumptions that R is
the closure of an open bounded set in x, y, z-space and that the
boundary of R is a simple surface. The functions a(x, y, z), b(x, y, z),
c(x, y, z) shall be continuous in R and have continuous and bounded
first derivatives in the interior points of R.

An important feature of formula (26) is its invariance under rigid
motions of space. This fact is more easily verified if subscripts rather
than different letters are used to distinguish variables. We replace
the quantities x, y, z by x_1, x_2, x_3 and a, b, c by a_1, a_2, a_3, and ξ, η, ζ by
ξ_1, ξ_2, ξ_3. Formula (26) becomes


\[ \iiint_R \sum_i \frac{\partial a_i}{\partial x_i} \, dx_1 \, dx_2 \, dx_3 = \iint_t \sum_i a_i \xi_i \, dA, \tag{27a} \]


where i = 1, 2, 3. Of course, the analogous formula with i ranging
from 1 to n holds in n dimensions.

A rigid motion is given by a linear transformation from x- to
y-variables of the form


\[ x_i = \sum_k c_{ik} y_k + d_i, \tag{27b} \]


where the c_ik and d_i are constants and the c_ik satisfy the
orthogonality relations [see (47) p. 156]


\[ \sum_i c_{ij} c_{ik} = \begin{cases} 0 & \text{for } j \neq k, \\ 1 & \text{for } j = k. \end{cases} \tag{27c} \]

The same law of transformation, but with the “inhomogeneous” terms
d_i omitted, applies to vectors, since their components are just
differences of the coordinates of their end points. Thus, we associate with
the a_i the components b_k of the same vector in the new system
determined by


\[ a_i = \sum_k c_{ik} b_k. \]


This law of transformation also applies to the direction cosines of the
normal on the boundary, which are just the components of the
exterior unit normal. The new direction cosines η_k are connected with
the ξ_i by the formulae


\[ \xi_i = \sum_k c_{ik} \eta_k. \]

Then, obviously,


\[ \sum_i \frac{\partial a_i}{\partial x_i} = \sum_{i,k} c_{ik} \frac{\partial b_k}{\partial x_i} = \sum_{i,k} \frac{\partial b_k}{\partial x_i} \frac{\partial x_i}{\partial y_k} = \sum_k \frac{\partial b_k}{\partial y_k}, \]


where we have made use of the chain rule of differentiation (see
pp. 208-209). Similarly, using (27c)


\[ \sum_i a_i \xi_i = \sum_{i,j,k} c_{ik} b_k c_{ij} \eta_j = \sum_k b_k \eta_k. \]


Hence, (27a) implies that


\[ \iiint_R \sum_k \frac{\partial b_k}{\partial y_k} \, dy_1 \, dy_2 \, dy_3 = \iint_t \sum_k b_k \eta_k \, dA \]


and, thus, represents a relation that is invariant under rigid motions
of space.1
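
The invariance of the integrand on the left of (27a) can be checked numerically. The following sketch draws a random orthogonal matrix (c_ik) and a random translation, forms the transformed components b = Cᵀa as prescribed by a_i = Σ_k c_ik b_k, and compares the two divergences at corresponding points by central differences. The particular vector field, the random seed, and the names divergence and shift are assumptions made for the illustration.

```python
import numpy as np

def divergence(field, x, d=1e-5):
    """Central-difference divergence of a field R^3 -> R^3 at the point x."""
    total = 0.0
    for i in range(3):
        e = np.zeros(3)
        e[i] = d
        total += (field(x + e)[i] - field(x - e)[i]) / (2 * d)
    return total

rng = np.random.default_rng(1)
C, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # orthogonal matrix (c_ik), cf. (27c)
shift = rng.normal(size=3)                    # the constants d_i of (27b)

def a(x):
    """Illustrative field in the x-system (an assumption for this example)."""
    return np.array([x[0] * x[1], np.sin(x[2]), x[0] ** 2 + x[1] * x[2]])

def b(y):
    """Components of the same vector in the y-system: a_i = sum_k c_ik b_k."""
    return C.T @ a(C @ y + shift)

y0 = rng.normal(size=3)
x0 = C @ y0 + shift            # the same point in x-coordinates
div_x = divergence(a, x0)      # sum_i da_i/dx_i
div_y = divergence(b, y0)      # sum_k db_k/dy_k
```

Up to the error of the difference quotients, div_x and div_y agree, as the chain-rule computation above predicts.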


1The invariance of the volume element follows because the Jacobian of the
transformation (27b), that is, the determinant of the c_ik, has the value +1 (see p. 175),
while that of the surface element dA = W du dv follows by transforming the
expression (1b) for W.


b. Proof of the Theorem


The proof of the general formula (26) is again simplified considerably
by the use of partitions of unity. This device permits us for a given
region R with boundary t to reduce the formula for general a, b, c
to the case where a, b, c are zero except in the neighborhood of a
point. We shall prove the following:


If every point Q in R has a neighborhood of radius ε_Q such that (26)
holds for all a, b, c vanishing outside that neighborhood,2 then the
formula holds for general a, b, c.

2We consider only functions a, b, c satisfying the assumptions stated: They are
continuous in R and have continuous derivatives in the interior points of R.

For the proof of this assertion, we use the auxiliary functions
ψ_Q(P) defined by


\[ \psi_Q(P) = \begin{cases} \left( \varepsilon_Q^2 - 4 \, (PQ)^2 \right)^2 & \text{for } PQ < \tfrac{1}{2} \varepsilon_Q, \\ 0 & \text{for } PQ \geq \tfrac{1}{2} \varepsilon_Q, \end{cases} \]


that are continuous and have continuous first derivatives for all P.
Since R is closed and bounded, we can pick a finite number of points
Q, say Q_1, Q_2, . . ., Q_N, such that the corresponding balls PQ_i <
½ε_{Q_i} cover all of R. We again introduce functions


\[ \chi_i(P) = \frac{\psi_{Q_i}(P)}{\psi_{Q_1}(P) + \cdots + \psi_{Q_N}(P)} \]


that are defined and have continuous first derivatives in all points P
of R and, besides, satisfy the conditions for a partition of unity


(a) χ_i(P) ≥ 0 in R
(b) Σ_i χ_i(P) = 1
(c) χ_i(P) = 0 for PQ_i > ½ε_{Q_i}


The function a can then be decomposed into

\[ a = \sum_i a \chi_i, \]


where the individual terms a χ_i are again continuous in R and have
continuous first derivatives in the interior points of R. Similarly, b
and c can be decomposed. Then, since formula (26) applies to the
individual terms, it obviously applies to the whole expression.

Hence, we only have to prove (26) for functions a, b, c vanishing 
outside an arbitrarily small neighborhood of a point Q. We distinguish 
the cases of Q in the interior of R and Q on the boundary surface t. 

For a point Q interior to R, we choose ε_Q so small that the ball of
radius 2ε_Q and center Q lies in R. For a, b, c vanishing outside the
ball of radius ε_Q, the surface integral vanishes and we only have to
prove that


\[ \iiint (a_x + b_y + c_z) \, dx \, dy \, dz = 0. \tag{28} \]


Here a, b, c are defined and have continuous derivatives in the whole
space if we put a = b = c = 0 outside R. The first derivatives of a, b, c
are integrable over every parallel to the coordinate axes. Applying
formula (29), p. 531 for the reduction of a triple integral to single
integrals we find, for example,

\[ \iiint c_z \, dx \, dy \, dz = \iint h(x, y) \, dx \, dy \]

where

\[ h(x, y) = \int c_z(x, y, z) \, dz = 0. \]


In this way (28) is established. 

Now consider the case where Q is a boundary point of R. We can
assume that the normal of the surface t at Q is not parallel to any of
the three coordinate planes; this can always be brought about by a
suitable rigid motion of space, which does not change the formula to
be proved. In a neighborhood of Q of sufficiently small radius ε_Q, no
normal will be parallel to a coordinate plane; that is, none of the
direction cosines ξ, η, ζ will vanish. If the neighborhood is sufficiently
small, the portion of t contained in it can be represented
nonparametrically, expressing any one of the three variables x, y, z as a
function of the other two. For example, we can represent t by an
equation


\[ z = F(x, y). \]


The set R in that neighborhood will be characterized either by z ≤
F(x, y) or by z ≥ F(x, y) (see p. 633). We assume, with no loss of
generality, that R is characterized locally by z ≤ F(x, y); the exterior
normal of t then has the direction cosines ξ, η, ζ where ζ > 0. For
a, b, c vanishing outside the neighborhood, and using u = x and v = y
as surface parameters, we have


\[ \iint_t c\zeta \, dA = \iint c \, dx \, dy, \tag{29} \]


in agreement with our orientation. On the other hand, continuing c
as 0, where not defined,1


\[ \iiint_R c_z \, dx \, dy \, dz = \iiint_{z < F(x, y)} c_z \, dx \, dy \, dz = \iint h(x, y) \, dx \, dy, \]

where

\[ h(x, y) = \int_{-\infty}^{F(x, y)} c_z(x, y, z) \, dz = c(x, y, F(x, y)). \]


1The corresponding function c_z is then bounded and continuous except in the set of
points (x, y, z) near Q for which z = F(x, y). This latter set has Jordan measure
zero. Hence c_z(x, y, z) is Riemann integrable as a function of x, y, z, and also as a
function of z alone for fixed x, y. (See footnote 2 on p. 407.) Thus formula (29), p. 531
applies.

Only points near Q contribute to the integrals, so that the function
F(x, y) also has to be defined only for (x, y, z) near Q. Comparison
with (29) establishes that


\[ \iint_t c\zeta \, dA = \iiint_R c_z \, dx \, dy \, dz. \]


Similarly, with y, z or x, z as parameters, it also follows that


\[ \iint_t a\xi \, dA = \iiint_R a_x \, dx \, dy \, dz, \qquad \iint_t b\eta \, dA = \iiint_R b_y \, dx \, dy \, dz. \]


This completes the proof of the divergence theorem (26). 
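
As a plausibility check of (26), the sketch below evaluates both sides by the midpoint rule for one concrete case: R is the closed unit ball, whose boundary is the unit sphere with exterior normal (ξ, η, ζ) = (x, y, z), and the field is (a, b, c) = (x³, y³, z³). The choice of region and field, the use of spherical coordinates for the quadrature, and the function names are assumptions of this example. For this field both sides equal 12π/5 exactly.

```python
import numpy as np

def field(x, y, z):
    """Illustrative vector field (a, b, c) = (x^3, y^3, z^3)."""
    return x**3, y**3, z**3

def divergence(x, y, z):
    """a_x + b_y + c_z for the field above."""
    return 3.0 * (x**2 + y**2 + z**2)

def volume_side(n=80):
    """Left side of (26): midpoint rule in spherical coordinates over the unit ball."""
    h_r, h_t, h_p = 1.0 / n, np.pi / n, np.pi / n
    r = (np.arange(n) + 0.5) * h_r
    theta = (np.arange(n) + 0.5) * h_t        # polar angle in (0, pi)
    phi = (np.arange(2 * n) + 0.5) * h_p      # azimuth in (0, 2 pi)
    R, T, P = np.meshgrid(r, theta, phi, indexing="ij")
    x = R * np.sin(T) * np.cos(P)
    y = R * np.sin(T) * np.sin(P)
    z = R * np.cos(T)
    jacobian = R**2 * np.sin(T)               # dx dy dz = r^2 sin(theta) dr dtheta dphi
    return np.sum(divergence(x, y, z) * jacobian) * h_r * h_t * h_p

def surface_side(n=80):
    """Right side of (26): midpoint rule over the unit sphere."""
    h_t, h_p = np.pi / n, np.pi / n
    theta = (np.arange(n) + 0.5) * h_t
    phi = (np.arange(2 * n) + 0.5) * h_p
    T, P = np.meshgrid(theta, phi, indexing="ij")
    x = np.sin(T) * np.cos(P)
    y = np.sin(T) * np.sin(P)
    z = np.cos(T)
    a, b, c = field(x, y, z)
    # On the unit sphere the exterior unit normal is (x, y, z) and dA = sin(theta) dtheta dphi.
    return np.sum((a * x + b * y + c * z) * np.sin(T)) * h_t * h_p

lhs = volume_side()
rhs = surface_side()
exact = 12.0 * np.pi / 5.0
```

The two quadrature values lhs and rhs agree with each other and with 12π/5 up to the discretization error of the midpoint rule.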


A.3 Stokes’s Theorem 


We consider a simple surface t, which need not be closed. Given 
a subset o of t we define the relative interior of o (that is “relative” to 
the surface t) as the set of points P of t with the property that in some 
suitable neighborhood of P all points of t belong to o. Similarly, the 
relative boundary of o consists of the points P of t for which every 
neighborhood contains points of t belonging to o as well as points of 
t not belonging to o. The set o is relatively open if each of its points 
is a relatively interior point. 

We now consider a closed and bounded subset s of t that shall 
consist of a relatively open set o and of its relative boundary. This 
relative boundary shall be a simple closed curve C, given parame- 
trically in the form 


\[ x = \alpha(t), \qquad y = \beta(t), \qquad z = \gamma(t), \tag{30} \]


where α, β, γ are functions of period p with continuous first
derivatives, for which α’² + β’² + γ’² > 0 for all parameter values t. We assume that the
surface t is oriented and that ξ, η, ζ are the direction cosines of the
positive normal on the oriented surface t*. We can then assign a
special orientation to the curve C determined by the orientation of t
and by the “side” of C on which o lies and, thus, make C into an
oriented curve C*. This “positive” orientation of C with respect to
t* can be defined in two equivalent ways. In x, y, z-space the tangent
vector of C corresponding to the direction of increasing parameter t points in
the direction given by the vector (α’(t), β’(t), γ’(t)). The exterior
product of this tangent vector and of the surface normal (ξ, η, ζ) is
the vector with components


\[ \beta' \zeta - \gamma' \eta, \qquad \gamma' \xi - \alpha' \zeta, \qquad \alpha' \eta - \beta' \xi. \tag{31} \]


Its direction, which is perpendicular to that of the tangent of C and
tangential to the surface, gives a distinguished normal direction for
C relative to the surface. The orientation assigned to C shall now be
that of increasing parameter t if the vector (31) points away from s and that of
decreasing parameter t if it points into s.

A different way of arriving at the same orientation uses the 
parameter representation for t in the neighborhood of the point P: 


\[ x = f(u, v), \qquad y = g(u, v), \qquad z = h(u, v), \tag{32} \]


where we assume that the parameters u, v are those defining the
orientation of t near P, that is, that the vector (A, B, C) defined by
(2), p. 625 points in the direction of the distinguished normal of t.1 The
curve C near P will be mapped onto an arc γ in the u, v-plane; the set s
near P will be mapped into a set ρ in the u, v-plane. We can define
the orientation of C as that corresponding to the positive orientation
of γ with respect to the set ρ, in the sense imparted by the orientation.
We could also say that the orientation of γ is that of increasing parameter t
if the vector with components dv/dt and −du/dt points away from ρ.

Given now three functions a(x, y, z), b(x, y, z), c(x, y, z), which
are defined and have continuous first derivatives in a neighborhood
of the set s, Stokes’s theorem is represented by the formula


\[ \iint_s [(c_y - b_z) \xi + (a_z - c_x) \eta + (b_x - a_y) \zeta] \, dA = \int_{C^*} (a \, dx + b \, dy + c \, dz). \tag{33} \]


1The parametric representation (32) of t is only local (i.e., valid near the point P).

The proof of the theorem follows a pattern that should be familiar
to the reader by now. By using a suitable partition of unity, we can
restrict ourselves to the case where the functions a, b, c vanish
outside an arbitrarily small neighborhood of a point Q of s. Near this
point the surface t has a parametric representation of the form (32) for
which the normal vector with components A, B, C given by (2), p. 625
has the direction fixed by the orientation of t*. We can write


\[
\begin{aligned}
\iint_s [(c_y - b_z) \xi + (a_z - c_x) \eta + (b_x - a_y) \zeta] \, dA
&= \iint [(c_y - b_z) A + (a_z - c_x) B + (b_x - a_y) C] \, du \, dv \\
&= \iint (\lambda_u + \mu_v) \, du \, dv,
\end{aligned}
\]


where

\[ \lambda = a x_v + b y_v + c z_v, \qquad -\mu = a x_u + b y_u + c z_u, \]


as is easily verified algebraically by substituting the expressions
(2), p. 625 for A, B, C and using the chain rule of differentiation


\[ a_u = a_x f_u + a_y g_u + a_z h_u, \]


and so on.1


If Q is now a point in the relative interior of s, then the functions
λ(u, v) and μ(u, v) vanish near the boundary γ of ρ, and from the
divergence theorem for two dimensions, we find


\[ \iint (\lambda_u + \mu_v) \, du \, dv = 0. \]


1Formula (63b), p. 321 is another version of this identity with L = a dx + b dy + c dz,
λ = L/dv, −μ = L/du.

On the other hand, if Q is on the relative boundary of s the corresponding
point in the u, v-plane lies on γ and λ, μ vanish outside a small
neighborhood of that point. In this case again, the two-dimensional
divergence theorem yields


\[ \iint (\lambda_u + \mu_v) \, du \, dv = \int_\gamma (\lambda p + \mu q) \, d\gamma, \]


where dγ is the element of length and p, q are the direction cosines
of the normal pointing away from ρ on the curve γ. Describing γ in
the positive sense with respect to ρ, we have


\[
\begin{aligned}
\int_\gamma (\lambda p + \mu q) \, d\gamma &= \int_\gamma (\lambda \, dv - \mu \, du) \\
&= \int_\gamma [(a x_u + b y_u + c z_u) \, du + (a x_v + b y_v + c z_v) \, dv] \\
&= \int_{C^*} (a \, dx + b \, dy + c \, dz),
\end{aligned}
\]


which was to be proved. 
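
Formula (33) can likewise be tested on a simple example. In the sketch below the set s is the cap z = 1 − x² − y², x² + y² ≤ 1, of a paraboloid, oriented by the upward normal, and the field is (a, b, c) = (z, x, y), whose curl (c_y − b_z, a_z − c_x, b_x − a_y) is (1, 1, 1). The surface, the field, and the polar parameters (r, θ) used for the quadrature are assumptions of this example; the polar parametrization degenerates at the apex and serves here only as a device for numerical integration. Both sides equal π.

```python
import numpy as np

def curl(x, y, z):
    """(c_y - b_z, a_z - c_x, b_x - a_y) for (a, b, c) = (z, x, y)."""
    one = np.ones_like(x)
    return one, one, one

def surface_side(n=400):
    """Left side of (33) over z = 1 - r^2, 0 < r < 1, by the midpoint rule."""
    h_r, h_t = 1.0 / n, 2.0 * np.pi / n
    r = (np.arange(n) + 0.5) * h_r
    theta = (np.arange(n) + 0.5) * h_t
    R, T = np.meshgrid(r, theta, indexing="ij")
    x, y, z = R * np.cos(T), R * np.sin(T), 1.0 - R**2
    # (A, B, C) = X_r x X_theta for X = (r cos(theta), r sin(theta), 1 - r^2); C = r > 0
    A = 2.0 * R**2 * np.cos(T)
    B = 2.0 * R**2 * np.sin(T)
    C = R
    cx, cy, cz = curl(x, y, z)
    return np.sum(cx * A + cy * B + cz * C) * h_r * h_t

def boundary_side(n=400):
    """Right side of (33): a dx + b dy + c dz along the circle r = 1, z = 0,
    described counterclockwise as seen from above."""
    h = 2.0 * np.pi / n
    t = (np.arange(n) + 0.5) * h
    x, y, z = np.cos(t), np.sin(t), np.zeros_like(t)
    dx, dy, dz = -np.sin(t), np.cos(t), np.zeros_like(t)
    a, b, c = z, x, y
    return np.sum(a * dx + b * dy + c * dz) * h

lhs = surface_side()
rhs = boundary_side()
```

With the upward normal, the vector (31) formed from the counterclockwise tangent points away from s, so this is the positive orientation of the boundary circle, and the two values agree.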


A.4 Surfaces and Surface Integrals in Euclidean Spaces of 
Higher Dimensions 


a. Elementary Surfaces 


Let E_M be M-dimensional euclidean space referred to Cartesian
coordinates x_1, . . ., x_M. We first define m-dimensional “elementary
surfaces” in E_M as sets of points that can be represented “nicely”
with the help of m parameters. We say a set S in E_M is an
m-dimensional elementary surface if we can find M functions f^1(u_1, . . ., u_m),
f^2(u_1, . . ., u_m), . . ., f^M(u_1, . . ., u_m) defined in an open set U of
u_1, u_2, . . ., u_m-space with the following properties:


1. The equations

\[ x_1 = f^1(u_1, \ldots, u_m), \quad \ldots, \quad x_M = f^M(u_1, \ldots, u_m) \]

define a 1-1 continuous mapping of U onto S whose inverse is also
continuous.

2. The functions f^i(u_1, . . ., u_m) have continuous first derivatives
in U.

3. For any point (u_1, . . ., u_m) in U and for i = 1, . . ., m, let
A^i = A^i(u_1, . . ., u_m) be defined as the vector in E_M with components
(f^1_{u_i}, f^2_{u_i}, . . ., f^M_{u_i}). We require that the m vectors A^i be
independent, that is, that


\[ W = \sqrt{\Gamma(A^1, A^2, \ldots, A^m)} > 0, \tag{34} \]


where Γ is the Gram determinant defined by (81a), p. 194.
One proves, as on p. 626, that if we represent S in the same
manner with the help of some other parameters v_1, . . ., v_m, there is a
1-1 continuously differentiable relation between corresponding
parameter points (u_1, . . ., u_m) and (v_1, . . ., v_m) with a nonvanishing
Jacobian:

\[ \frac{d(u_1, \ldots, u_m)}{d(v_1, \ldots, v_m)} \neq 0. \tag{35} \]

If F(x_1, . . ., x_M) is a function defined and continuous on the
elementary surface S which has compact support on S (that is, F
vanishes outside a closed and bounded subset of S), we define1 the integral
of F over S by


\[ \int \cdots \int_S F \, dS = \int \cdots \int_U F W \, du_1 \cdots du_m. \tag{36} \]


The integral defined in this manner does not depend2 on the
particular parametric representation used for S.
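
The quantities in (34) and (36) are easy to compute numerically. The sketch below assumes that Γ(A^1, . . ., A^m) is the determinant of the matrix of scalar products A^i · A^k, so that W = √det(JᵀJ) where J is the M × m matrix with columns A^i; the derivatives are approximated by central differences. The example surface, a two-dimensional torus patch in E_4 with radii 1 and 2, and the helper names are assumptions chosen for the illustration.

```python
import itertools
import numpy as np

def gram_W(f, u, d=1e-6):
    """W = sqrt(Gram determinant) of the vectors A^i = df/du_i at the parameter point u."""
    u = np.asarray(u, dtype=float)
    columns = []
    for i in range(u.size):
        e = np.zeros_like(u)
        e[i] = d
        columns.append((f(u + e) - f(u - e)) / (2 * d))
    J = np.stack(columns, axis=1)          # M x m matrix with columns A^1, ..., A^m
    return np.sqrt(np.linalg.det(J.T @ J))

def integrate_over_surface(F, f, lower, upper, n=24):
    """Midpoint rule for (36) over the parameter box lower < u < upper."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    h = (upper - lower) / n
    total = 0.0
    for idx in itertools.product(range(n), repeat=lower.size):
        u = lower + (np.array(idx) + 0.5) * h
        total += F(f(u)) * gram_W(f, u)
    return total * np.prod(h)

r1, r2 = 1.0, 2.0

def torus(u):
    """Two-dimensional surface in E_4 (m = 2, M = 4); an illustrative example."""
    return np.array([r1 * np.cos(u[0]), r1 * np.sin(u[0]),
                     r2 * np.cos(u[1]), r2 * np.sin(u[1])])

# Area of the patch 0 < u_1, u_2 < 2 pi, obtained from (36) with F = 1.
area = integrate_over_surface(lambda x: 1.0, torus, [0.0, 0.0], [2 * np.pi, 2 * np.pi])
```

Here the vectors A^1 and A^2 are orthogonal with lengths r1 and r2, so W = r1 r2 is constant and the computed area is close to 4π² r1 r2 = 8π².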

At a point P_0 of S we form the corresponding vectors A^i, give them
initial point P_0, and denote their final points by P_i, so that A^i is the
vector P_0P_i. The m + 1 points P_0, P_1, . . ., P_m lie in an m-dimensional
plane p_0, the tangent plane of S at P_0. If p_0 is endowed with an
orientation (see p. 200), converting it into the oriented tangent plane
p_0*, we have


\[ \Omega(p_0^*) = \varepsilon(p_0) \, \Omega(A^1, \ldots, A^m), \tag{37a} \]


where ε(p_0) has either the value +1 or −1. We call the surface S
oriented if at every point P of S we orient the tangent plane p* =
p*(P) so that the orientation depends continuously on P; that is, for


\[ \Omega(p^*) = \Omega(B^1, \ldots, B^m) \]

with suitable vectors B^1, . . ., B^m in p*, we require that3

\[ [B^1(P), \ldots, B^m(P); \; B^1(P_0), \ldots, B^m(P_0)] > 0 \]

for all points P on S sufficiently close to a point P_0. Since the vectors
A^i vary continuously with the point P of contact, the orientation of p*
varies continuously with the point of contact P if the factor ε(P)
defined by (37a) varies continuously with P on S. Since ε can only
have the values +1 or −1, it follows, as on p. 579, that for a connected
elementary surface there are only two possible orientations. In any
case, the oriented surface S* determines an orientation of the set
U in the parameter space u_1, . . ., u_m, namely, the one given by


\[ \Omega(U) = \varepsilon(P) \, \Omega(u_1, \ldots, u_m) \tag{37b} \]


[see (40n, o, p), pp. 580-581]. Here, under a change of parameters from
u_1, . . ., u_m to v_1, . . ., v_m the quantity ε is just multiplied by the
sign of the Jacobian (35).

1The cube with edges of length h parallel to the coordinate axes in u_1, . . ., u_m-space
is mapped up to terms of higher order onto a parallelepiped in x_1, . . ., x_M-space
spanned by the vectors hA^1, . . ., hA^m and, hence, of m-dimensional volume

\[ \sqrt{\Gamma(hA^1, \ldots, hA^m)} = h^m W. \]

This makes it plausible that dS should be identified with the element of volume in
u_1, . . ., u_m-space multiplied by the factor W.

2To prove this, we observe that under changes of parameters, W is multiplied by the
absolute value of the Jacobian of the parameter transformation, for such a
transformation results in a linear substitution for the vectors A^i that changes the volume
W of the parallelepiped spanned by the vectors only by a factor equal to the
determinant of the substitution (see p. 202).

3The symbol in brackets stands for the determinant defined by (85a), p. 198.


b. Integral of a Differential Form over an Oriented Elementary Surface


After these preliminaries we are ready to define the integral of an mth-order differential form $\omega$ over an m-dimensional oriented elementary surface S*. The form $\omega$ is some linear combination of ordered products of m of the differentials $dx_1, \ldots, dx_M$ at a time, say,


$\omega = a\,dx_1\,dx_2 \cdots dx_m + b\,dx_2\,dx_3 \cdots dx_{m+1} + c\,dx_1\,dx_3 \cdots dx_m + \cdots,$


where the coefficients $a(x_1, \ldots, x_M)$, $b(x_1, \ldots, x_M)$, . . . are assumed to be continuous and to have compact support on S*.¹ Let S* be represented parametrically with the help of parameters $u_1, \ldots, u_m$ that vary over the set U*, oriented in accordance with the orientation of S*. We then define


$\int\cdots\int_{S^*} \omega = \int\cdots\int_{U^*} \left[ a\,\dfrac{d(x_1, x_2, \ldots, x_m)}{d(u_1, u_2, \ldots, u_m)} + b\,\dfrac{d(x_2, x_3, \ldots, x_{m+1})}{d(u_1, u_2, \ldots, u_m)} + \cdots \right] du_1 \cdots du_m.$


¹That is, a, b, c, . . . vanish outside some closed and bounded subset of S*.
Our notation¹ has been arranged in such a way that the value of the integral does not depend on the particular parameter representation used for S*.


c. Simple m-Dimensional Surfaces 


By “patching together” elementary surfaces, we can obtain simple surfaces just as in three-space. A set t in M-dimensional euclidean space is called an m-dimensional simple surface if each point P of t has a neighborhood intersecting t in an elementary m-dimensional surface. If each of the elementary surfaces occurring in the characterization of a simple surface is oriented and if the orientations of two of these elementary surfaces agree whenever they overlap, we say that the simple surface t has been oriented.

At each point of an m-dimensional oriented simple surface t* we can choose m vectors $A^1(P), \ldots, A^m(P)$ such that


$\Omega(t^*) = \Omega[A^1(P), \ldots, A^m(P)]$
and
$[A^1(P), \ldots, A^m(P);\ A^1(Q), \ldots, A^m(Q)] > 0$


for Q sufficiently close to P.

For subsets s of an m-dimensional simple surface t we can define the relative boundary² of s, that is, the boundary of s relative to the surface t. The relative boundary of s consists of those points of t for which each neighborhood contains points of s and points of t not belonging to s. The relative closure³ of s consists of s and of the relative boundary points of s. The set s is called relatively open if it has no


¹Here, for a continuous integrand $F(u_1, \ldots, u_m)$, the integral of F over an oriented set U* with orientation


$\Omega(U^*) = \varepsilon\,\Omega(u_1, \ldots, u_m)$
($\varepsilon = \pm 1$ and continuous) is defined by


$\int\cdots\int_{U^*} F\,du_1 \cdots du_m = \int\cdots\int_U F\,\varepsilon\,du_1 \cdots du_m,$


where the integral on the right side has the ordinary meaning that gives positive values for positive integrands.

²This notion is needed when we want to discuss, say, the boundary curve of a two-dimensional surface s in spaces of dimensions M > 2. The (“absolute”) boundary of the surface s taken with respect to the whole space always contains the whole surface s.

³The relative closure of s also is the set of all points of t that are limits of sequences formed from points of s.
points in common with its relative boundary and called relatively closed if it contains its relative boundary.

Of particular interest is the case where s is a subset of the m-dimensional simple surface t whose relative boundary itself is an (m - 1)-dimensional simple surface ∂s. We assume furthermore that s is the relative closure of a relatively open set. In the neighborhood of a point P of ∂s we can always represent ∂s and t “nonparametrically”; that is, we can use some of the Cartesian coordinates $x_1, \ldots, x_M$ in space as independent variables; after a suitable renumbering of coordinates we then have for t near P the parametric representation


$x_i = f_i(x_1, \ldots, x_m)$   (i = m + 1, . . ., M),
and on ∂s we have an additional condition
$x_1 = g(x_2, \ldots, x_m)$


with continuously differentiable functions $f_i$ and g. Moreover, the points of s are characterized near P by either the inequality


$g(x_2, \ldots, x_m) \le x_1$
or by
$g(x_2, \ldots, x_m) \ge x_1.$

If we deal with an oriented set s*, we can assign a unique orientation to the relative boundary ∂s. Let there be given m - 1 independent vectors $A^2, \ldots, A^m$ at a point P of ∂s that are tangential to ∂s and an additional vector $A^1$ that is tangential to t but not to ∂s at P and that points away from s*. We then have


(38) $\Omega(s^*) = \varepsilon\,\Omega(A^1, \ldots, A^{m-1}, A^m),$


where $\varepsilon$ has either the value +1 or -1. The boundary ∂s* is then called oriented positively with respect to s* if


(39) $\Omega(\partial s^*) = \varepsilon\,\Omega(A^2, \ldots, A^m).$
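
The rule (38), (39) can be illustrated in the simplest case m = M = 2. The sketch below is not from the original text; it assumes that s* is the unit disc oriented like the $x_1, x_2$-plane, so that $\varepsilon$ is the sign of the determinant with rows $A^1, A^2$, and that an orientation of the one-dimensional boundary is represented by a direction vector. The function name is hypothetical.

```python
import math

def positive_boundary_direction(outward, tangent):
    """Return eps * A^2, where eps = sign det(A^1, A^2), for a set oriented like the plane.

    outward: vector A^1 pointing away from s; tangent: vector A^2 tangential to the boundary.
    """
    det = outward[0] * tangent[1] - outward[1] * tangent[0]
    eps = 1.0 if det > 0 else -1.0           # eps defined by (38)
    return (eps * tangent[0], eps * tangent[1])   # direction given by (39)

theta = 0.7                                   # hypothetical boundary point on the unit circle
outward = (math.cos(theta), math.sin(theta))
counterclockwise = (-math.sin(theta), math.cos(theta))
clockwise = (math.sin(theta), -math.cos(theta))

dir_a = positive_boundary_direction(outward, counterclockwise)
dir_b = positive_boundary_direction(outward, clockwise)
```

Whichever tangent vector $A^2$ is chosen, the resulting direction is the counterclockwise one, so the positive orientation of the boundary does not depend on the choice of $A^2$.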


In particular, let m = M and t be the whole M-dimensional space. Let s be the closure of an open¹ set and let the boundary of s be an


¹We can omit here the word relative.
(m - 1)-dimensional simple surface ∂s. Assume that in a neighborhood of a point P the surface ∂s has the nonparametric representation


$x_1 = g(x_2, \ldots, x_m).$


We can define a quantity $\delta = \pm 1$ so that


(40a) $[x_1 - g(x_2, \ldots, x_m)]\,\delta < 0$
for points $(x_1, \ldots, x_m)$ in s near P. We choose for $A^2, \ldots, A^m$ the vectors

$A^2 = (g_{x_2}, 1, 0, \ldots, 0, 0), \ \ldots, \ A^m = (g_{x_m}, 0, \ldots, 0, 1)$


tangential to ∂s, and for $A^1$ the vector
$A^1 = (\delta, 0, \ldots, 0)$
that points away from s. Then in $x_1, \ldots, x_m$-coordinates
$\det(A^1, \ldots, A^{m-1}, A^m) = \delta,$
so that [see (83a, b), p. 197]
$\Omega(A^1, \ldots, A^{m-1}, A^m) = \delta\,\Omega(x_1, \ldots, x_m).$
For the oriented set s* let $\varepsilon = \pm 1$ be defined near P by (38). Then,
(40b) $\Omega(s^*) = \varepsilon\delta\,\Omega(x_1, \ldots, x_m),$
while for the boundary ∂s* oriented positively with respect to s*, relation (39) holds. Consequently, if $x_2, \ldots, x_m$ are considered as parameters for the surface ∂s* near P, then the orientation of $x_2, \ldots, x_m$-space determined by ∂s* is


(40c) $\varepsilon\,\Omega(x_2, \ldots, x_m)$


[see (37b), p. 647]. Thus, for a set s* oriented positively with respect to $x_1, \ldots, x_m$-coordinates ($\varepsilon\delta = 1$), the positively oriented boundary has the orientation of the $x_2, \ldots, x_m$-system where s lies “below” the boundary, and the opposite one where s lies “above” the boundary (compare p. 634).
A.5 Integrals over Simple Surfaces, Gauss’s Divergence Theorem and the General Stokes Formula in Higher Dimensions


We define integrals over simple surfaces by means of partitions of unity exactly as on p. 635. In particular, if t* is an m-dimensional oriented simple surface and $\omega$ an mth-order differential form, the integral

$\int\cdots\int_{t^*} \omega$

is defined provided the coefficients of $\omega$ are continuous and vanish outside a bounded and closed¹ subset of t*.

Now let t be an m-dimensional simple surface in M-space and s* an oriented bounded and closed subset of t. We assume that s* is the closure of a relatively open set and that the relative boundary of s*, oriented positively with respect to s*, is an (m - 1)-dimensional oriented simple surface ∂s*. Let $\omega$ be a differential form of order m - 1 with coefficients that have continuous first derivatives. Stokes’s general theorem asserts that


(41) $\int\cdots\int_{\partial s^*} \omega = \int\cdots\int_{s^*} d\omega.$


We shall first treat the special case where m = M, which is Gauss’s divergence theorem in m dimensions. In this case, we take t as the whole space, s* as an oriented set that is the closure of an open set bounded by an (m - 1)-dimensional simple surface ∂s* oriented positively with respect to s*. The form $\omega$ of degree m - 1 can be written as
$a_1\,dx_2\,dx_3 \cdots dx_m + a_2\,dx_3\,dx_4 \cdots dx_m\,dx_1 + \cdots + a_m\,dx_1\,dx_2 \cdots dx_{m-1},$
where the $a_i$ are functions of $x_1, \ldots, x_m$. Then,


(42a) $d\omega = da_1\,dx_2\,dx_3 \cdots dx_m + da_2\,dx_3\,dx_4 \cdots dx_m\,dx_1 + \cdots + da_m\,dx_1\,dx_2 \cdots dx_{m-1}$


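¹Not just relatively closed.
$= \dfrac{\partial a_1}{\partial x_1}\,dx_1\,dx_2 \cdots dx_m + \dfrac{\partial a_2}{\partial x_2}\,dx_2\,dx_3 \cdots dx_m\,dx_1 + \cdots + \dfrac{\partial a_m}{\partial x_m}\,dx_m\,dx_1 \cdots dx_{m-1}$


$= K\,dx_1 \cdots dx_m,$
where


(42b) $K = \dfrac{\partial a_1}{\partial x_1} + (-1)^{m-1}\dfrac{\partial a_2}{\partial x_2} + (-1)^{2(m-1)}\dfrac{\partial a_3}{\partial x_3} + \cdots + (-1)^{(m-1)^2}\dfrac{\partial a_m}{\partial x_m}.$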

The proof of formula (41) for this case proceeds exactly as in the special case m = 3 discussed on pp. 639-642, and there is no point in recapitulating the individual steps. The only item to be checked is the sign in the final formula. The proof finally reduces to the case where $a_2, \ldots, a_m$ vanish identically and $a_1$ vanishes outside a neighborhood of a point P of the surface ∂s*. Here near P the surface is given by an equation


$x_1 = g(x_2, \ldots, x_m)$
and s* is given by the inequality
$[x_1 - g(x_2, \ldots, x_m)]\,\delta < 0,$
where $\delta = \pm 1$. Let the number $\varepsilon = \pm 1$ be defined at P by
$\Omega(s^*) = \varepsilon\delta\,\Omega(x_1, \ldots, x_m)$


[see (40b)]. Then, by (42a, b),


$\int\cdots\int_{s^*} d\omega = \varepsilon\delta \int\cdots\int \dfrac{\partial a_1}{\partial x_1}\,dx_1 \cdots dx_m = \varepsilon \int\cdots\int a_1\big|_{x_1 = g}\,dx_2 \cdots dx_m.$


On the other hand [see (40b) and (40c)], we also have


$\int\cdots\int_{\partial s^*} \omega = \varepsilon \int\cdots\int a_1\big|_{x_1 = g}\,dx_2 \cdots dx_m.$

This completes the proof of the divergence theorem. 
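
As a plausibility check of (41) in the case m = M = 2, one can compare both sides numerically. The sketch below is an illustration under stated assumptions, not the book's argument: s is the unit disc oriented like the $x_1, x_2$-plane, its boundary is traversed counterclockwise, and the coefficients $a_1 = x_1$, $a_2 = x_1 x_2$ are hypothetical. For m = 2 the form is $\omega = a_1\,dx_2 + a_2\,dx_1$ and $K = \partial a_1/\partial x_1 - \partial a_2/\partial x_2$.

```python
import math

def boundary_integral(n=2000):
    """Integral of a1 dx2 + a2 dx1 over the counterclockwise unit circle (midpoint rule)."""
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        x1, x2 = math.cos(th), math.sin(th)
        dx1, dx2 = -math.sin(th) * h, math.cos(th) * h
        a1, a2 = x1, x1 * x2
        total += a1 * dx2 + a2 * dx1
    return total

def interior_integral(n=400):
    """Integral of K over the unit disc, evaluated in polar coordinates (midpoint rule)."""
    hr, ht = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            th = (j + 0.5) * ht
            x1 = r * math.cos(th)
            K = 1.0 - x1          # da1/dx1 = 1, da2/dx2 = x1, sign (-1)**(m-1) = -1
            total += K * r * hr * ht
    return total

lhs = boundary_integral()
rhs = interior_integral()
```

Both values approximate $\pi$, the exact value of either side for these coefficients.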
The general Stokes formula for arbitrary m < M is an immediate consequence. Using partitions of unity, it is again sufficient to establish it for differential forms that vanish outside a neighborhood of a point P of the simple surface t. In that neighborhood t is identical with an elementary surface. Introducing local parameters $u_1, \ldots, u_m$ to describe t, the identity (41) goes over into the corresponding identity in m-dimensional parameter space, where now everything is reduced to Gauss’s divergence theorem discussed above. In this way, the general Stokes theorem is established.

This kind of argument makes it clear that the embedding of our m-dimensional surface t in a euclidean space of dimension M is rather irrelevant. All that counts are the local parametric representations mapping t onto a set in euclidean m-space. This suggests that similar formulae will hold on more general m-dimensional abstract manifolds that near every point can be described by parameters. However, in order to avoid topological considerations beyond the scope of this book, we have restricted ourselves to simple surfaces in euclidean spaces.


CHAPTER 
6 


Differential Equations 


We have already discussed special cases of differential equations 
in Volume I, Chapter 9. We cannot attempt to develop the general 
theory in detail within the scope of this book. In this chapter, how- 
ever, starting with further examples from mechanics, we shall give 
at least a sketch of some of the principles of the subject, making use 
of the calculus of functions of several variables. 


6.1 The Differential Equations for the Motion of a Particle in 
Three Dimensions 


a. The Equations of Motion 


In Volume I (Chapter 4, pp. 397-423), we discussed the motion of 
a particle constrained to move in the x, y-plane. We now drop this 
restriction and consider a mass m that we suppose concentrated at 
a point with coordinates (x, y, z). The position vector from the origin 
to the particle has components x, y, z and we denote it by R. A motion 
of the particle will then be represented mathematically if we can 
express (x, y, z) or R as a function of the time t. If, as before, we denote 
differentiation with respect to the time t by a dot, then the vector 
$\dot{R} = (\dot{x}, \dot{y}, \dot{z})$ of length


(1) $v = \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}$


represents the velocity, and the vector $\ddot{R} = (\ddot{x}, \ddot{y}, \ddot{z})$, the acceleration of the particle.

The fundamental tool for determining the motion is Newton’s second law¹, according to which the product of the acceleration vector


¹“Mutationem motus proportionalem esse vi motrici impressae, et fieri secundum
$\ddot{R}$ and the mass m is equal to the force vector F = (X, Y, Z) acting on the particle:


(2a) $m\ddot{R} = F,$
or, in components,


(2b) $m\ddot{x} = X, \quad m\ddot{y} = Y, \quad m\ddot{z} = Z.$


These relations¹ can be used to find the motion, provided we are given sufficient information about the force F.

One example is the constant field of force representing gravity near 
the surface of the earth. If we take gravity as acting in the direction 
of the negative z-axis, we know the force to be represented by the 
vector 


(3) $F = (0, 0, -mg) = -mg\,(\mathrm{grad}\ z),$


where g is the constant acceleration due to gravity (see Volume I, 
p. 399). 

Another example is the field of force produced by a mass $\mu$ concentrated at the origin of the coordinate system and attracting according to Newton’s law of gravitation (see Volume I, p. 413). If $r = \sqrt{x^2 + y^2 + z^2} = |R|$ is the distance of the particle (x, y, z) with mass m from the origin, the field of force is given by the expression


(4a) $F = \mu m \gamma\ \mathrm{grad}\ \dfrac{1}{r},$


where $\gamma$ is the universal gravitational constant. In this case, Newton’s law of motion (2a) states that


(4b) $\ddot{R} = \mu\gamma\ \mathrm{grad}\ \dfrac{1}{r}$
or, in components,


$\ddot{x} = -\mu\gamma\,\dfrac{x}{r^3}, \quad \ddot{y} = -\mu\gamma\,\dfrac{y}{r^3}, \quad \ddot{z} = -\mu\gamma\,\dfrac{z}{r^3}.$


lineam rectam qua vis illa imprimitur” (i.e., “Change of motion is proportional to the force applied and takes place in the direction of the straight line in which the force acts”).

¹The vector $m\dot{R}$ is called the momentum, so that Newton’s law states that “force equals the rate of change of momentum”.
In general, if F is a given field of force with components X(x, y, z), Y(x, y, z), Z(x, y, z), which are known functions of position, the equations of motion


$m\ddot{x} = X(x, y, z), \quad m\ddot{y} = Y(x, y, z), \quad m\ddot{z} = Z(x, y, z)$


form a system of three differential equations for the three unknown functions x(t), y(t), z(t). The fundamental problem of the mechanics of a particle is to determine the path of the particle from the differential equations, when at the beginning of the motion, say at the time t = 0, the position of the particle [i.e., the coordinates $x_0 = x(0)$, $y_0 = y(0)$, $z_0 = z(0)$] and the initial velocity [i.e., the quantities $\dot{x}_0 = \dot{x}(0)$, $\dot{y}_0 = \dot{y}(0)$, $\dot{z}_0 = \dot{z}(0)$] are given. The problem of finding three functions that satisfy these initial conditions and also satisfy the three differential equations for all values of t is known as the problem of the solution or integration¹ of the system of differential equations.
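
When no closed-form solution is at hand, the initial-value problem just described can be integrated approximately step by step. The following is a minimal sketch under stated assumptions, not a method from the original text: it uses the classical fourth-order Runge-Kutta scheme, and the test field F = -R with unit mass is a hypothetical choice made because its exact solution is known.

```python
import math

def rk4_step(force, m, R, V, dt):
    """One Runge-Kutta step for m * R'' = force(R), with position R and velocity V."""
    def acc(pos):
        return tuple(f / m for f in force(pos))
    def shifted(a, b, s):
        return tuple(x + s * y for x, y in zip(a, b))
    k1r, k1v = V, acc(R)
    k2r, k2v = shifted(V, k1v, dt / 2), acc(shifted(R, k1r, dt / 2))
    k3r, k3v = shifted(V, k2v, dt / 2), acc(shifted(R, k2r, dt / 2))
    k4r, k4v = shifted(V, k3v, dt), acc(shifted(R, k3r, dt))
    R_new = tuple(r + dt / 6 * (a + 2 * b + 2 * c + d)
                  for r, a, b, c, d in zip(R, k1r, k2r, k3r, k4r))
    V_new = tuple(v + dt / 6 * (a + 2 * b + 2 * c + d)
                  for v, a, b, c, d in zip(V, k1v, k2v, k3v, k4v))
    return R_new, V_new

def integrate(force, m, R0, V0, t_end, steps):
    """Advance the initial data (R0, V0) from t = 0 to t = t_end."""
    R, V = tuple(R0), tuple(V0)
    dt = t_end / steps
    for _ in range(steps):
        R, V = rk4_step(force, m, R, V, dt)
    return R, V

# Hypothetical test field: linear restoring force F = -R acting on a unit mass.
spring = lambda R: tuple(-c for c in R)
R1, V1 = integrate(spring, 1.0, (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), t_end=1.0, steps=100)
```

For these initial data the exact solution is x = cos t, y = 2 sin t, z = 0, which the computed position at t = 1 reproduces closely.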


b. The Principle of Conservation of Energy 


The equations of motion (2a) for a particle have an important consequence obtained by forming the scalar product with the velocity vector $\dot{R}$:


(6a) $m\ddot{R} \cdot \dot{R} = F \cdot \dot{R} = X\dot{x} + Y\dot{y} + Z\dot{z}.$
Here the left-hand side can be written as


(6b) $\dfrac{d}{dt}\left(\dfrac{1}{2}\,m\dot{R} \cdot \dot{R}\right) = \dfrac{d}{dt}\,\dfrac{1}{2}\,mv^2,$

that is, as the time derivative of the kinetic energy $\frac{1}{2}mv^2$ (energy of motion) of the particle. Integrating equation (6a) with respect to t from $t_0$ to $t_1$, we find that the change in kinetic energy of the particle during the time interval from $t_0$ to $t_1$ is given by


(6c) $\dfrac{1}{2}\,mv_1^2 - \dfrac{1}{2}\,mv_0^2 = \int_{t_0}^{t_1} \left(X\,\dfrac{dx}{dt} + Y\,\dfrac{dy}{dt} + Z\,\dfrac{dz}{dt}\right) dt = \int (X\,dx + Y\,dy + Z\,dz),$
where the line integral is extended over the path described by the particle during the time from $t_0$ to $t_1$. The integral

$\int X\,dx + Y\,dy + Z\,dz$


¹The word is used here because the solution of differential equations may be regarded as a generalization of the process of ordinary integration.
taken over an oriented arc is called the work done by the force F = (X, Y, Z) in moving along this arc.¹ Hence, (6c) can be stated as the equation of energy: The gain in kinetic energy is equal to the work done by the force during the motion.

In the important case where the field of force can be represented as the gradient of a function, say


(7a) $F = \mathrm{grad}\ \phi,$
the integral of the differential form


$X\,dx + Y\,dy + Z\,dz = d\phi$


is independent of the path and depends only on the initial and final points of the path (see p. 95). Following Helmholtz, a field of force of the type (7a) is called conservative.² We introduce the potential energy U (energy of position) of the conservative force field by $U = -\phi$. The equations of motion then have the form


$m\ddot{R} = -\mathrm{grad}\ U$
or, in components,
(7b) $m\ddot{x} = -U_x, \quad m\ddot{y} = -U_y, \quad m\ddot{z} = -U_z.$


The potential energy as a function of position (x, y, z) is determined by the force field only within an arbitrary additive constant. For the work done by the conservative forces during the motion we find


$\int X\,dx + Y\,dy + Z\,dz = -\int dU = U_0 - U_1,$


¹See Volume I, p. 420. Introducing the arc length s as parameter, the line integral takes the form

$\int F \cdot \dfrac{dR}{ds}\,ds$

and thus is equal to the limit of the sums of the component of force in the direction of motion multiplied with the distances.

²“Conservative” by virtue of the theorem of the conservation of energy, which we shall deduce shortly.
where $U_0$ and $U_1$ are the respective values of the potential energy for the positions of the particle at the times $t_0$ and $t_1$. Comparison with (6c) shows that


$\dfrac{1}{2}\,mv_1^2 + U_1 = \dfrac{1}{2}\,mv_0^2 + U_0.$

Hence, the quantity $\frac{1}{2}mv^2 + U$ has the same value at any times $t_0$ and $t_1$ during the motion. Without going into the physical explanation of these concepts, we have arrived at a form of the law of conservation of energy for a particle in a conservative field of force:


The total energy, that is, the sum of the kinetic energy $\frac{1}{2}mv^2$ and of the potential energy U, remains constant during the motion.

In the examples in the next sections we show how this theorem 
can be used in the actual solution of the equations of motion. 

We notice that both the force fields defined by equations (3) and (4a) are conservative. The equations of motion under the uniform gravitational field (3) reduce to


(8a) $\ddot{x} = 0, \quad \ddot{y} = 0, \quad \ddot{z} = -g.$


Their general solution trivially is given by

(8b) $x = a_1 t + a_2, \quad y = b_1 t + b_2, \quad z = -\dfrac{1}{2}\,gt^2 + c_1 t + c_2.$


Here, obviously, the constants $(a_2, b_2, c_2)$ give the initial position, and the constants $(a_1, b_1, c_1)$, the initial velocity of the particle at the time t = 0. The trajectory of a particle given parametrically in terms of the time t by equations (8b) is a parabola with axis parallel to the z-axis. Since the force field is $-mg\ \mathrm{grad}\ z$, the potential energy is U = mgz + constant. Changes in U are proportional to changes in elevation z. The law of conservation of energy thus takes the form


(8c) $\dfrac{1}{2}\,mv^2 + mgz = \text{constant} = \dfrac{1}{2}\,mv_0^2 + mgz_0 = \dfrac{1}{2}\,m(a_1^2 + b_1^2 + c_1^2) + mgc_2.$


The velocity v is therefore least at the highest point of the trajectory.
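
The statements (8b) and (8c) are easy to verify by direct computation. In the sketch below, which is an illustration and not part of the original text, the numerical values of g, m, and the six constants are hypothetical.

```python
g = 9.81                          # assumed acceleration due to gravity
m = 2.0                           # hypothetical mass
a1, b1, c1 = 3.0, -1.0, 12.0      # initial velocity components
a2, b2, c2 = 0.0, 5.0, 1.5        # initial position

def position(t):
    """Position according to (8b)."""
    return (a1 * t + a2, b1 * t + b2, -0.5 * g * t * t + c1 * t + c2)

def velocity(t):
    """Time derivative of (8b)."""
    return (a1, b1, -g * t + c1)

def speed(t):
    return sum(c * c for c in velocity(t)) ** 0.5

def energy(t):
    """Left-hand side of (8c): kinetic plus potential energy at time t."""
    return 0.5 * m * speed(t) ** 2 + m * g * position(t)[2]

energy_initial = 0.5 * m * (a1 ** 2 + b1 ** 2 + c1 ** 2) + m * g * c2
t_top = c1 / g                    # time at which dz/dt = 0, the highest point
sample_times = [0.0, 0.4, t_top, 1.7, 2.5]
```

Evaluating `energy(t)` at the sample times returns the same value `energy_initial` up to rounding, and `speed(t)` is smallest at `t_top`, where only the horizontal velocity components remain.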

Instead of a freely falling particle, we can consider a particle moving under the influence of the gravitational field $F = -mg\ \mathrm{grad}\ z$, where the particle is constrained to stay on a surface z = f(x, y) by a reaction force perpendicular to the surface.¹ Since the reaction force has no component in the direction of motion, and hence does no work, the work done during the motion is that done by the conservative gravitational field. We arrive thus at the same equation of energy


(9) $\dfrac{1}{2}\,mv^2 + mgz = \text{constant},$


as for the freely falling body, the only difference being that z = f(x, y) is now a prescribed function of the coordinates x, y.


c. Equilibrium. Stability 


The equations of motion
(10a) $m\ddot{R} = -\mathrm{grad}\ U$


of a particle in a conservative force field enable us to discuss motions near a position of equilibrium. We say that the particle is in equilibrium under the influence of the field of force if it remains at rest. In order that this may be the case, its velocity and its acceleration must both be 0 throughout the interval of time under consideration. The equations of motion (10a) therefore yield


(10b) $\mathrm{grad}\ U = 0$
or
(10c) $U_x = U_y = U_z = 0$


as the necessary conditions for equilibrium. Thus, a position of equilibrium $(x_0, y_0, z_0)$ necessarily is a critical point of the potential energy U. Conversely, every critical point $(x_0, y_0, z_0)$ of U is a possible position of rest, since obviously the constant vector


$R = (x_0, y_0, z_0)$


satisfies the equations (10a).
Of great practical importance is the notion of stability of equilibri- 
um. We mean by stability that if we slightly disturb the state of 


¹An example is furnished by the spherical pendulum where a mass is constrained to move on a sphere. Compare with the motions on a curve discussed in Volume I, pp. 405 ff.
equilibrium, the whole resulting motion will differ only slightly from the state of rest.¹ More precisely, let $r_1$ and $v_1$ be any positive numbers. We can find corresponding to $r_1$ and $v_1$ two positive numbers $r_0$, $v_0$ so small that if the particle is moved a distance not more than $r_0$ from its position of equilibrium and started off with a velocity not greater than $v_0$, then in its whole subsequent motion it will never reach a distance greater than $r_1$ from the point of equilibrium or a velocity greater than $v_1$.

It is particularly interesting that the equilibrium is stable at a point at which the potential energy U has a strict relative minimum.² It is remarkable that we can prove this statement about stability without actually solving the equations of motion. For simplicity, we assume that the position of equilibrium under consideration is the origin, which we can always bring about by a translation. Moreover, since the potential energy is only determined within a constant, we can assume that U(0, 0, 0) = 0. Since U has a strict relative minimum at the origin, we can find a positive number $r < r_1$ such that U > 0 everywhere on the surface of the sphere of radius r about the origin and in its interior, except at the origin. The minimum value of U on the surface of the sphere is then a positive number a. Since U is continuous, we can find an $r_0 < r$ such that $U(x, y, z) < \frac{1}{2}a$ and $U(x, y, z) < \frac{1}{4}mv_1^2$ in the solid sphere of radius $r_0$ about the origin. Let, moreover, the positive number $v_0$ be so small that $\frac{1}{2}mv_0^2 < \frac{1}{2}a$ and $\frac{1}{2}mv_0^2 < \frac{1}{4}mv_1^2$. Then, for an initial position of the particle of distance less than $r_0$ from the origin and an initial velocity less than $v_0$, we have initially for the total energy the inequalities


(11a) $\dfrac{1}{2}\,mv^2 + U(x, y, z) < \dfrac{1}{2}\,mv_0^2 + \dfrac{1}{2}\,a < a$
(11b) $\dfrac{1}{2}\,mv^2 + U(x, y, z) < \dfrac{1}{4}\,mv_1^2 + \dfrac{1}{4}\,mv_1^2 = \dfrac{1}{2}\,mv_1^2.$


¹The notion can be illustrated best by the analogous two-dimensional problem of a particle moving under gravity but constrained to stay on a surface z = f(x, y). Here the positions of equilibrium are the critical points of the potential energy mgz = mgf(x, y), that is, the highest or lowest points or saddle points of the surface z = f(x, y). The equilibrium is stable for a particle resting, say, under the influence of gravity at the lowest point of a spherical bowl, which is concave upward. On the other hand, a particle resting at the highest point of a spherical bowl that is concave downward is in unstable equilibrium; the slightest disturbance results in a large change of position. Since small disturbances can always be assumed to be present in practice, unstable equilibrium is not maintained and is unlikely to be observed.

²At a strict minimum point the value of U is lower than at all other points of a sufficiently small neighborhood. See pp. 325-6 for the definitions.
Since the energy is constant throughout the motion, we see from (11a) that at all subsequent times


$\dfrac{1}{2}\,mv^2 + U(x, y, z) < a,$


and consequently,
$U(x, y, z) < a.$


Since initially the particle is inside the sphere of radius r and since $U \ge a$ on that sphere, the particle can never reach the surface of the sphere. This shows that the distance of the particle from the origin never exceeds the value $r < r_1$. Since also $U \ge 0$ inside the sphere of radius r, it follows from (11b) that

$\dfrac{1}{2}\,mv^2 < \dfrac{1}{2}\,mv_1^2$
and, consequently, that the velocity of the particle never exceeds the value $v_1$, as was to be proved.


d. Small Oscillations About a Position of Equilibrium 


The motion of a particle about a position of stable equilibrium, corresponding to a minimum of the potential energy, can be approximated in a simple way. For the sake of brevity, we restrict ourselves to a motion in the x, y-plane and assume that there is no force acting in the direction of the z-axis. We also assume that the potential U(x, y) has a minimum at the origin and that U(0, 0) = 0. Moreover, at the minimum point, $U_x = U_y = 0$. We imagine U expanded by Taylor’s theorem in the form


$U = \dfrac{1}{2}\,(ax^2 + 2bxy + cy^2) + \cdots.$


The function U will have a strict relative minimum at the origin if the quadratic form


(12a) $Q(x, y) = \dfrac{1}{2}\,(ax^2 + 2bxy + cy^2)$


is positive definite,¹ that is, if


¹See page 347. The positive definite character of Q is sufficient, but not necessary, for a strict relative minimum. However, it is necessary that Q be neither indefinite nor negative definite.
(12b) $a > 0, \quad ac - b^2 > 0.$


We assume that conditions (12b) are satisfied and that in a sufficiently small neighborhood of the position of equilibrium at the origin the potential energy U can be replaced with sufficient accuracy by the quadratic form Q.¹ With these assumptions the equations of motion take the form


$m\ddot{R} = -\mathrm{grad}\ Q$
or


(12c)² $m\ddot{x} = -ax - by, \quad m\ddot{y} = -bx - cy.$


The equations (12c) can be integrated completely if we first rotate the x- and y-axes through a suitably chosen angle $\phi$ so that the new coordinate axes coincide with the principal axes of the ellipses Q = constant. We make the orthogonal substitution


¹No serious attempt at justifying this “plausible” assumption can be made here.
²We again can interpret these equations as approximating the equations of motion under gravity of a particle constrained to move on a surface z = f(x, y) near a minimum point of that surface. The precise equations of motion here have the form
$\ddot{x} = -\lambda f_x, \quad \ddot{y} = -\lambda f_y, \quad \ddot{z} = -g + \lambda,$

taking into account that the forces acting on a particle consist of the gravitational force (0, 0, -mg) and a reaction force $(-\lambda f_x, -\lambda f_y, \lambda)$ perpendicular to the surface and containing an indeterminate multiplier $\lambda$. We can eliminate $\lambda$ by observing that


$\ddot{z} = \dfrac{d^2 f}{dt^2} = f_x\ddot{x} + f_y\ddot{y} + f_{xx}\dot{x}^2 + 2f_{xy}\dot{x}\dot{y} + f_{yy}\dot{y}^2$


and find the equations
$\ddot{x} = -\lambda f_x, \quad \ddot{y} = -\lambda f_y$
with
$\lambda = \dfrac{g + f_{xx}\dot{x}^2 + 2f_{xy}\dot{x}\dot{y} + f_{yy}\dot{y}^2}{1 + f_x^2 + f_y^2}$
for the two unknown functions x, y. If f has a minimum at the origin and is approximated there by the quadratic


(13a) $f = \dfrac{1}{2}\,(\alpha x^2 + 2\beta xy + \gamma y^2),$


we find near the origin, neglecting all nonlinear terms, the differential equations
(13b) $\ddot{x} = -g(\alpha x + \beta y), \quad \ddot{y} = -g(\beta x + \gamma y),$
which are of the form (12c). If, for example, the surface is the sphere
$z = L - \sqrt{L^2 - x^2 - y^2}$
(“spherical pendulum of length L”), we find


(13c) $\ddot{x} = -\dfrac{g}{L}\,x, \quad \ddot{y} = -\dfrac{g}{L}\,y.$
$x = \xi \cos\phi - \eta \sin\phi, \quad y = \xi \sin\phi + \eta \cos\phi,$


where $\phi$ is determined from the condition that
$Q = \dfrac{1}{2}\,(ax^2 + 2bxy + cy^2) = \dfrac{1}{2}\,(\alpha\xi^2 + \gamma\eta^2)$


with suitable positive constants $\alpha$, $\gamma$.¹ In the new rectangular coordinates $\xi$, $\eta$ the equations of motion (12c) transform into


(14a) $m\ddot{\xi} = -\alpha\xi, \quad m\ddot{\eta} = -\gamma\eta.$


As in Volume I (p. 404), both these equations can be integrated completely. We obtain


(14b) $\xi = A_1 \sin\sqrt{\dfrac{\alpha}{m}}\,(t - c_1), \quad \eta = A_2 \sin\sqrt{\dfrac{\gamma}{m}}\,(t - c_2),$


where $c_1$, $c_2$, $A_1$, $A_2$ are constants of integration that enable us to make the motion satisfy any arbitrarily assigned initial conditions.²
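
The rotation to principal axes can be carried out numerically. The sketch below is illustrative and not part of the original text: it takes the angle from the relation $\tan 2\phi = 2b/(a - c)$, evaluated with `atan2` so that the case a = c causes no division by zero, and the values of a, b, c, m are hypothetical ones satisfying (12b).

```python
import math

def principal_axes(a, b, c):
    """Angle phi and constants alpha, gamma with Q = (alpha*xi**2 + gamma*eta**2) / 2."""
    phi = 0.5 * math.atan2(2.0 * b, a - c)        # one solution of tan(2*phi) = 2b/(a - c)
    cs, sn = math.cos(phi), math.sin(phi)
    alpha = a * cs * cs + 2.0 * b * cs * sn + c * sn * sn
    gamma = a * sn * sn - 2.0 * b * cs * sn + c * cs * cs
    return phi, alpha, gamma

def Q(a, b, c, x, y):
    """Quadratic form (12a)."""
    return 0.5 * (a * x * x + 2.0 * b * x * y + c * y * y)

a, b, c, m = 3.0, 1.0, 2.0, 1.5                   # hypothetical: a > 0 and a*c - b*b > 0
phi, alpha, gamma = principal_axes(a, b, c)

xi, eta = 0.4, -0.7                               # arbitrary point in the rotated coordinates
x = xi * math.cos(phi) - eta * math.sin(phi)
y = xi * math.sin(phi) + eta * math.cos(phi)
q_original = Q(a, b, c, x, y)
q_rotated = 0.5 * (alpha * xi * xi + gamma * eta * eta)

freq_xi, freq_eta = math.sqrt(alpha / m), math.sqrt(gamma / m)   # frequencies in (14b)
```

The two values of Q coincide, and $\alpha + \gamma = a + c$, $\alpha\gamma = ac - b^2$, so that both constants are positive exactly when (12b) holds.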

The form of the solution shows that the motion about a position of stable equilibrium results from the superposition of simple harmonic oscillations in the two principal directions, the $\xi$-direction and the $\eta$-direction, the frequencies of these oscillations being given by $\sqrt{\alpha/m}$ and $\sqrt{\gamma/m}$.³ A general discussion of these oscillations, which we shall not carry out here, shows that the resultant motion may take a great variety of forms.

To give a few examples of these compound oscillations, we first consider the motion represented by the equations


$\xi = \sin(t + c), \quad \eta = \sin(t - c).$
By eliminating the time t, we obtain the equation

¹One finds immediately that $\phi$ is determined from the equation


$\tan 2\phi = \dfrac{2b}{a - c}.$


The positivity of $\alpha$, $\gamma$ follows from the positive definiteness of Q.

²It is of interest to observe that in cases of unstable equilibrium, one or both of the constants $\alpha$, $\gamma$ might be negative. In that case, the trigonometric functions occurring in (14b) would have to be replaced by hyperbolic ones and the coordinates $\xi$, $\eta$ do not both stay bounded for all t.

³In the case (13c) of the spherical pendulum, the two frequencies have the same value $\sqrt{g/L}$.
$(\xi + \eta)^2 \sin^2 c + (\xi - \eta)^2 \cos^2 c = 4 \sin^2 c \cos^2 c,$


which represents an ellipse. The two components of the oscillation have the same frequency 1 and the same amplitude 1, but a difference of phase 2c. If this difference of phase successively takes all values between 0 and $\pi/2$, the corresponding ellipse passes from the degenerate straight-line case $\xi - \eta = 0$ to the circle $\xi^2 + \eta^2 = 1$, and the oscillation passes from the so-called linear oscillation to the circular (cf. Figs. 6.1-6.3).


Figures 6.1-6.3 Oscillation diagrams.


If, as a second example, we consider the motion represented by the equations


$\xi = \sin t, \quad \eta = \sin 2(t - c),$


where the frequencies are no longer equal, we obtain oscillation diagrams decidedly more complicated. In Figs. 6.4-6.6 these curves are given for the phase differences c = 0, $c = \pi/8$, and $c = \pi/4$, respectively. In the first two cases, the particle moves continuously on a closed curve, but in the last case, it swings backward and forward


Figures 6.4-6.6 Oscillation diagrams.
on an arc of the parabola $\eta = 2\xi^2 - 1$. The curves obtained by the superposition of different simple harmonic oscillations in directions at right angles to one another are given the general name of Lissajous figures.
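
Both examples can be checked by sampling the curves. The following sketch is an illustration, not part of the original text; the function name, the sample count, and the phase constant c = 0.3 in the first example are hypothetical choices.

```python
import math

def lissajous(freq_xi, freq_eta, c_xi, c_eta, n=1000, t_max=2.0 * math.pi):
    """Sample points (sin(freq_xi*(t - c_xi)), sin(freq_eta*(t - c_eta))) for 0 <= t < t_max."""
    points = []
    for i in range(n):
        t = t_max * i / n
        points.append((math.sin(freq_xi * (t - c_xi)),
                       math.sin(freq_eta * (t - c_eta))))
    return points

c = 0.3   # hypothetical phase constant

# First example: xi = sin(t + c), eta = sin(t - c); the points satisfy the ellipse equation.
ellipse_residual = max(
    abs((xi + eta) ** 2 * math.sin(c) ** 2 + (xi - eta) ** 2 * math.cos(c) ** 2
        - 4.0 * math.sin(c) ** 2 * math.cos(c) ** 2)
    for xi, eta in lissajous(1.0, 1.0, -c, c))

# Second example with c = pi/4: xi = sin t, eta = sin 2(t - pi/4) lies on eta = 2*xi**2 - 1.
parabola_residual = max(
    abs(eta - (2.0 * xi * xi - 1.0))
    for xi, eta in lissajous(1.0, 2.0, 0.0, math.pi / 4))
```

Both residuals vanish up to rounding error. Plotting the point lists returned by `lissajous` for other phase constants would reproduce closed curves of the kind shown in the oscillation diagrams.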


e. Planetary Motion 


In the examples discussed above, the differential equations of the 
motion can immediately (or after a simple transformation) be written 
in such a way that each of the coordinates occurs in one differential 
equation only and can be determined by elementary integration. We 
shall now consider the most important case of a motion in which the 
equations of motion are no longer separable in this simple way, so that 
their integration involves a somewhat more difficult calculation. The 
problem in question is the deduction of Kepler’s laws of planetary 
motion from Newton’s law of attraction. We suppose that at the origin 
of the coordinate system there is a body of mass $\mu$ (e.g., the sun) whose
gravitational field of force per unit mass is given by the vector

$$\gamma\mu \operatorname{grad} \frac{1}{r}.$$


What is the motion of a particle of mass m (a planet) under the in- 
fluence of this field of force? The equations of motion are (see p. 655) 


(15)  $\ddot{x} = -\gamma\mu \frac{x}{r^3}, \qquad \ddot{y} = -\gamma\mu \frac{y}{r^3}, \qquad \ddot{z} = -\gamma\mu \frac{z}{r^3}.$


In order to integrate them, we first state the theorem of conservation
of energy (see p. 658) for the motion in the form

$$\frac{1}{2} m (\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - \frac{\gamma\mu m}{r} = C,$$

where C is constant throughout the motion and is determined by the 
initial conditions. 

From the equations of motion (15) we can deduce other equations 
in which only the components of the velocity, not the acceleration, 
are present. If we multiply the first equation of motion by y, the 
second by x, and then subtract, we obtain 


$$x\ddot{y} - y\ddot{x} = 0 \quad \text{or} \quad \frac{d}{dt}(x\dot{y} - y\dot{x}) = 0,$$


¹ The special case of circular motion has been discussed in Volume I (pp. 413 ff.).

whence, by integration, we have

$$x\dot{y} - y\dot{x} = c_1.$$


Similarly, from the remaining equations of motion we obtain¹

$$y\dot{z} - z\dot{y} = c_2, \qquad z\dot{x} - x\dot{z} = c_3.$$


These equations enable us to simplify our problem very considera- 
bly in a way that is highly plausible from the intuitive point of view. 
Without loss of generality, we can choose the coordinate system in 
such a way that at the beginning of the motion, that is, at t = 0, the 
particle lies in the x, y-plane and its velocity vector at that time also 
lies in that plane. Then z(0) = 0, and ż(0) = 0; and by substituting 
these values in the above equations and remembering that the right- 
hand sides are constants, we obtain 


(16a)  $x\dot{y} - y\dot{x} = c_1 = h,$
(16b)  $y\dot{z} - z\dot{y} = 0,$
(16c)  $z\dot{x} - x\dot{z} = 0.$


From these equations we conclude in the first place that the whole 
motion takes place in the plane z = 0. Since we naturally exclude the 
possibility of an initial collision between the sun and planet, we as- 
sume that initially the three coordinates (x, y, z) do not vanish 


¹ We can also arrive at these three equations using vector notation if we form the
vector product of both sides of the equation of motion and the position vector $\mathbf{R}$.
Since the force vector is in the same direction as the position vector, we obtain zero
on the right, while the expression $\mathbf{R} \times \ddot{\mathbf{R}}$ on the left is the derivative of the vector
$\mathbf{R} \times \dot{\mathbf{R}}$ with respect to the time. It therefore follows that this vector $\mathbf{R} \times \dot{\mathbf{R}} = \mathbf{C}$ has
a value constant in time; this is exactly what is stated by the coordinate equations
above.

As we see, this equation does not depend on our special problem but holds in
general for every motion in which the force has the same direction as the position
vector.

The vector $\mathbf{R} \times \dot{\mathbf{R}}$ is called the moment of velocity and the vector $m\mathbf{R} \times \dot{\mathbf{R}}$ the
moment of momentum of the motion. From the geometrical meaning of the vector
product we easily obtain the following intuitive interpretation of the relation just given
(cf. the subsequent discussions in the text). If we project the moving particle onto
the coordinate planes and in each coordinate plane consider the area that the radius
vector from the origin to the point of projection sweeps over in time $t$, this area is
proportional to the time (theorem of areas).
simultaneously, so that at the time $t = 0$ at which $z(0) = 0$, we have,
say, $x(0) \neq 0$. Now, from (16c), it follows that

$$\frac{d}{dt}\left(\frac{z}{x}\right) = -\frac{z\dot{x} - x\dot{z}}{x^2} = 0.$$

Therefore, $z = ax$, where $a$ is a constant. If we put $t = 0$ here, then
from the equations $z(0) = 0$ and $x(0) \neq 0$, it follows that $a = 0$, so
that $z$ is always 0.

We therefore reduce our problem to integration of the two dif- 
ferential equations 


(17a)  $\frac{1}{2} m (\dot{x}^2 + \dot{y}^2) - \frac{\gamma\mu m}{r} = C,$
(17b)  $x\dot{y} - y\dot{x} = h.$


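We next use the equations $x = r\cos\theta$, $y = r\sin\theta$ to transform the
rectangular coordinates $(x, y)$ into the polar coordinates $(r, \theta)$, which
are now to be determined as functions of $t$. Since

$$\dot{x}^2 + \dot{y}^2 = \dot{r}^2 + r^2\dot{\theta}^2, \qquad x\dot{y} - y\dot{x} = r^2\dot{\theta},$$

we have the two differential equations

(17c)  $\frac{1}{2} m (\dot{r}^2 + r^2\dot{\theta}^2) - \frac{\gamma\mu m}{r} = C,$

(17d)  $r^2\dot{\theta} = h$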


for the polar coordinates $r$, $\theta$. The first of these equations is the
theorem of the conservation of energy, while the second expresses
Kepler’s law of areas. In fact (cf. Volume I, pp. 371-372) the
expression $\frac{1}{2} r^2\dot{\theta}$ is the derivative with respect to the time of the area swept
out in time $t$ by the radius vector from the origin to the particle. This
is found to be constant, or, as Kepler expressed it, the radius vector
describes equal areas in equal times.
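
The two conserved quantities, the energy and the area constant, can be observed in a direct numerical integration of the equations of motion in the plane $z = 0$. The sketch below is only an illustration: it assumes $\gamma\mu = 1$, hypothetical initial data, and a velocity-Verlet time step (a choice made here for simplicity, not a method taken from the text). It records $h = x\dot{y} - y\dot{x}$ and $C/m$ along the computed path.

```python
import math

GAMMA_MU = 1.0  # assumed value of the product gamma*mu; any positive number works

def acceleration(x, y):
    """Right-hand sides of the equations of motion for a motion in the plane z = 0."""
    r3 = math.hypot(x, y) ** 3
    return -GAMMA_MU * x / r3, -GAMMA_MU * y / r3

def simulate(x, y, vx, vy, dt=1e-3, steps=20000):
    """Velocity-Verlet integration; records h and C/m before every step."""
    ax, ay = acceleration(x, y)
    area_constants, energies = [], []
    for _ in range(steps):
        area_constants.append(x * vy - y * vx)             # area constant h
        energies.append(0.5 * (vx * vx + vy * vy)
                        - GAMMA_MU / math.hypot(x, y))     # energy constant divided by m
        vx += 0.5 * dt * ax                                # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                                       # drift
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax                                # second half kick
        vy += 0.5 * dt * ay
    return area_constants, energies

# Hypothetical initial data: start at (1, 0) with velocity (0, 1.2).
area_constants, energies = simulate(1.0, 0.0, 0.0, 1.2)
h_spread = max(area_constants) - min(area_constants)
energy_spread = max(energies) - min(energies)
print(h_spread, energy_spread)
```

With this scheme the area constant stays fixed up to rounding, because every kick is parallel to the position vector, while the energy is conserved only up to a small error that shrinks with the step size.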

If the area constant $h$ is zero, $\dot{\theta}$ must vanish; that is, $\theta$ must remain
constant, so that the motion must take place on a straight line
through the origin. We exclude this special case and expressly assume
that $h \neq 0$.
In order to find the geometrical form of the orbit, we shall no
longer describe it parametrically in terms of the time¹ but consider
the angle $\theta$ as a function of $r$ or $r$ as a function of $\theta$, and from our two
equations we calculate the derivative $dr/d\theta$ as a function of $r$.

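If we substitute the value $\dot{\theta} = h/r^2$ from the area equation in the
energy equation and recall the equation

$$\dot{r} = \frac{dr}{d\theta}\,\dot{\theta},$$

we at once obtain the differential equation of the orbit in the form

$$\frac{m}{2}\left[\left(\frac{dr}{d\theta}\right)^2 \frac{h^2}{r^4} + \frac{h^2}{r^2}\right] - \frac{\gamma\mu m}{r} = C$$

or

(17e)  $\left(\frac{dr}{d\theta}\right)^2 = \frac{r^4}{h^2}\left(\frac{2C}{m} + \frac{2\gamma\mu}{r} - \frac{h^2}{r^2}\right).$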


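To simplify the later calculations, we make the substitution

$$r = \frac{1}{u}$$

and introduce the following abbreviations:

$$p = \frac{h^2}{\gamma\mu}, \qquad \varepsilon^2 = 1 + \frac{2Ch^2}{m\gamma^2\mu^2}.$$

The differential equation (17e) then becomes

$$\left(\frac{du}{d\theta}\right)^2 = \frac{\varepsilon^2}{p^2} - \left(u - \frac{1}{p}\right)^2,$$

and this can be integrated immediately. We have

$$\theta - \theta_0 = \int \frac{du}{\sqrt{\varepsilon^2/p^2 - (u - 1/p)^2}},$$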

¹ The course of the motion as a function of the time can be determined subsequently
by means of the equation

$$\int_{\theta_0}^{\theta} r^2\, d\theta = h(t - t_0),$$

in which we suppose that $r$ is known as a function of $\theta$ (cf. p. 670).
or, if for the moment we introduce $u - 1/p = v$ as a new variable,

$$\theta - \theta_0 = \int \frac{dv}{\sqrt{\varepsilon^2/p^2 - v^2}}.$$

For the integral [by Volume I, p. 270, formula (24)] we obtain the
value $\arcsin(vp/\varepsilon)$ and thus find the equation of the orbit in the form

$$\frac{1}{r} - \frac{1}{p} = v = \frac{\varepsilon}{p} \sin(\theta - \theta_0).$$

The angle $\theta_0$ can be chosen arbitrarily, since it is immaterial from
which fixed line the angle $\theta$ is measured. If we take $\theta_0 = \pi/2$ (that
is, if we let $v = 0$ correspond to the value $\theta = \pi/2$), we finally obtain
the equation of the orbit in the form

$$r = \frac{p}{1 - \varepsilon\cos\theta}.$$

This is the familiar equation in polar coordinates of a conic having
one focus at the origin.¹
Our result therefore gives Kepler’s law:


The planets move in conics with the sun at one focus. 

It is interesting to relate the constants of integration

$$p = \frac{h^2}{\gamma\mu}, \qquad \varepsilon^2 = 1 + \frac{2Ch^2}{m\gamma^2\mu^2}$$

to the initial motion. The quantity $p$ is known as the semi-latus
rectum or parameter of the conic; in the case of the ellipse and the
hyperbola it is connected with the semiaxes $a$ and $b$ by the simple
relation

$$p = \frac{b^2}{a}.$$

The square of the eccentricity, $\varepsilon^2$, determines the character of the
conic; it is an ellipse, a parabola, or a hyperbola, according to whether
$\varepsilon^2$ is less than, equal to, or greater than 1.

¹ This is seen easily by transforming the equation to rectangular coordinates:

$$(x - \varepsilon a)^2 + \frac{y^2}{1 - \varepsilon^2} = a^2 \qquad \left(a = \frac{p}{1 - \varepsilon^2}\right).$$

From the relation

$$\varepsilon^2 = 1 + \frac{2Ch^2}{m\gamma^2\mu^2}$$

we see at once that the three different possibilities can also be stated
in terms of the energy constant $C$; the orbit is an ellipse, a parabola,
or a hyperbola, according to whether $C$ is less than, equal to, or
greater than zero.

If we suppose that at time $t = 0$ the particle is at the point $\mathbf{R}_0$ in
the field of force and is moving with initial velocity $\dot{\mathbf{R}}_0$, then the
relation

$$C = \frac{1}{2} m v_0^2 - \frac{\gamma\mu m}{r_0}$$

gives the surprising fact that the character of the orbit (ellipse,
parabola, or hyperbola) does not depend on the direction of the initial
velocity at all, but only on its absolute value $v_0$.
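
The dependence of the orbit type on the initial speed alone can be illustrated with a few lines of Python. This is a sketch under stated assumptions: $\gamma\mu = 1$, quantities taken per unit mass, and hypothetical function names. It evaluates $C/m = \frac{1}{2} v_0^2 - \gamma\mu/r_0$ and also $\varepsilon^2 = 1 + 2Ch^2/(m\gamma^2\mu^2)$, in which the direction of the velocity enters only through the area constant $h$.

```python
import math

GAMMA_MU = 1.0  # assumed value of gamma*mu

def energy_per_mass(r0, v0):
    """C/m from the relation C = (1/2) m v0^2 - gamma*mu*m / r0."""
    return 0.5 * v0 ** 2 - GAMMA_MU / r0

def classify(r0, v0, tol=1e-12):
    """Type of conic determined by the sign of the energy constant C."""
    c = energy_per_mass(r0, v0)
    if abs(c) <= tol:
        return "parabola"
    return "ellipse" if c < 0 else "hyperbola"

def eccentricity_squared(r0, v0, angle):
    """epsilon^2; angle is measured between the position vector and the velocity."""
    h = r0 * v0 * math.sin(angle)  # area constant x*ydot - y*xdot
    return 1.0 + 2.0 * energy_per_mass(r0, v0) * h ** 2 / GAMMA_MU ** 2

print(classify(1.0, 1.0), classify(1.0, math.sqrt(2.0)), classify(1.0, 2.0))
print(eccentricity_squared(1.0, 1.0, math.pi / 2),
      eccentricity_squared(1.0, 1.0, math.pi / 4))
```

Changing the launch angle at fixed speed changes the eccentricity (here from a circle to an ellipse with $\varepsilon^2$ close to $\frac{1}{2}$) but leaves the sign of $C$, and hence the type of conic, unchanged.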
Kepler’s third law is a simple consequence of the other two: 


For a planet in elliptic orbit the square of the period bears a con- 
stant ratio to the cube of the major semiaxis, the ratio depending on 
the field of force only and not on the particular planet. 

If we denote the period by $T$ and the major semiaxis by $a$, we should
then have

$$\frac{T^2}{a^3} = \text{constant},$$


where the constant on the right is independent of the particular prob- 
lem and depends only on the magnitude of the attracting mass and on 
the gravitational constant. 

To prove this we use the theorem of areas (17d) in the integrated
form

$$\int_{\theta_0}^{\theta} r^2\, d\theta = h(t - t_0),$$

which defines the motion as a function of the time. If we take the
integral over the interval from 0 to $2\pi$, we obtain on the left twice
the area of the orbital ellipse, and that, by previous results, is $2\pi ab$;
on the right the time difference $t - t_0$ is replaced by the period $T$.
Therefore,

$$2\pi ab = hT \quad \text{or} \quad 4\pi^2 a^2 b^2 = h^2 T^2.$$
We already know that $h^2$ is connected with the $a$ and $b$ of the orbit
by the relation $h^2/\gamma\mu = p = b^2/a$. If we replace $h^2$ in the above
equations by $(b^2/a)\gamma\mu$, it follows at once that

$$\frac{T^2}{a^3} = \frac{4\pi^2}{\gamma\mu},$$

which exactly expresses Kepler’s third law.
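
The third law can also be checked numerically from the theorem of areas. The following sketch assumes an elliptic orbit $r = p/(1 - \varepsilon\cos\theta)$ with hypothetical values of $p$, $\varepsilon$, and $\gamma\mu$. It evaluates $\int_0^{2\pi} r^2\, d\theta = hT$ by a simple rectangle rule and compares $T^2/a^3$ with $4\pi^2/\gamma\mu$.

```python
import math

def period_from_areas(p, eps, gamma_mu, n=2000):
    """Period T from the theorem of areas: integral of r^2 over 0..2*pi equals h*T."""
    h = math.sqrt(p * gamma_mu)                      # from p = h^2 / (gamma*mu)
    d_theta = 2.0 * math.pi / n
    integral = sum((p / (1.0 - eps * math.cos(k * d_theta))) ** 2
                   for k in range(n)) * d_theta      # rectangle rule over a full period
    return integral / h

def kepler_ratio(p, eps, gamma_mu):
    """T^2 / a^3 for the ellipse r = p / (1 - eps*cos(theta))."""
    a = p / (1.0 - eps ** 2)                         # major semiaxis
    return period_from_areas(p, eps, gamma_mu) ** 2 / a ** 3

for p, eps, gamma_mu in [(1.0, 0.3, 1.0), (2.0, 0.6, 3.0)]:
    print(kepler_ratio(p, eps, gamma_mu), 4.0 * math.pi ** 2 / gamma_mu)
```

The two printed numbers in each row agree to many digits although the orbits differ in size and eccentricity; only $\gamma\mu$ affects the ratio.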


Exercises 6.1e


1. Treat in detail the motion of an orbiting body in a straight line trajectory 
[h = 0 in equation (17d)]. 

2. Prove that as $t \to \infty$ the velocity $v$ of a planet tends to 0 if its orbit
is a parabola and to a positive limit if it is a hyperbola.

3. Prove that a body attracted toward a center $O$ by a force of magnitude
$mr$ moves on an ellipse with center $O$.

4. Prove that the orbit of a body repelled by a force of magnitude $f(r)$, where
$f$ is a given function, from a center $O$ is given in polar coordinates $(r, \theta)$ by

$$\theta = \int_{r_0}^{r} \frac{dr}{r^2 \sqrt{2c/h^2 + (2/h^2)\int_{r_0}^{r} f(\rho)\, d\rho - 1/r^2}}.$$


5. Prove that the equation of the orbit of a body repelled with a force 
u/r? from a center 0 is 


i | gap cn 0 + 9 for u< k? 


cosh (k0 + £) for u> h? 


k = Ji- 


and e is a constant of integration. 


6. A planet is moving on an ellipse, and $\omega = \omega(t)$ denotes the angle $P'MP_s$,
where $P'$ is the point on the auxiliary circle corresponding to $P$, the
position of the planet at the time $t$; $P_s$ its position at the time $t_s$ when it is
nearest to the sun $S$; and $M$ the center of the ellipse. Prove that $\omega$ and
$t$ are connected by Kepler’s equation

$$h(t - t_s) = ab(\omega - \varepsilon \sin\omega).$$


7. Prove that in a central field of force the attraction $p$ per unit mass is
given by

$$p = \frac{h^2}{q^3} \frac{dq}{dr},$$

where $q$ is the distance of the tangent of the orbit from the pole and $h$
the area constant (p. 667). Hence prove that the cardioid $r = a(1 + \cos\theta)$
can be described under an attraction to the pole equal to $\mu r^{-4}$ per unit
mass.


8. A particle of unit mass moves under the action of two forces, of which the
first is always toward the origin and is equal to $\lambda^2$ times the distance of
the particle from that point, while the second is always at right angles to
the path of the particle and is equal to $2\mu$ times its velocity. Prove that if
the particle is projected from the origin along the axis of $x$ with velocity
$u$, its coordinates at any subsequent time $t$ are

$$x = \frac{u}{\sqrt{\lambda^2 + \mu^2}} \sin\left(\sqrt{\lambda^2 + \mu^2}\, t\right) \cos \mu t,$$

$$y = \frac{u}{\sqrt{\lambda^2 + \mu^2}} \sin\left(\sqrt{\lambda^2 + \mu^2}\, t\right) \sin \mu t.$$

9. Let there be n fixed particles in a plane, all attracting with a central force 
of magnitude 1/r. Prove that there are not more than n — 1 positions of 
equilibrium for a particle in the field. 

Calculate these positions for the case of four attracting particles with 
coordinates (a, b), (a, — b), (— a, b), (— a, — b), where a > b > 0. 


f. Boundary Value Problems. The Loaded Cable and the Loaded Beam. 


In the problems of mechanics and the other examples previously 
discussed, we selected from the whole family of functions satisfying 
the differential equation a particular one by means of so-called initial 
conditions; that is, we chose the constants of integration in such a 
way that the solution and, in certain cases, some of its derivatives 
assume preassigned values at a definite point. In many applications 
we are concerned neither with finding the general solution nor with 
solving definite initial-value problems but with solving a so-called 
boundary value problem. In a boundary value problem we seek a 
solution that satisfies preassigned conditions at several points and 
satisfies the differential equation in the intervals between those 
points. Here we shall discuss a few typical examples without going 
into the general theory of such boundary value problems. 


Example 1—The Differential Equation of a Loaded Cable 


In a vertical x, y-plane—in which the y-axis is vertical—we suppose 
that a cable with (constant) horizontal component of tension S is 
stretched from the origin to the point x = a, y = b, (cf. Fig. 6.7). The 
cable is acted on by a load whose density per unit length of horizontal 
projection is given by a sectionally continuous function p(x). Then 
the sag y(x) of the cable, that is, the y-coordinate, is given by the 
differential equation 
(18)  $y''(x) = g(x), \qquad g(x) = \frac{p(x)}{S}.$

Figure 6.7 Loaded cable.


The shape of the cable will then be given by that solution $y(x)$ of the
differential equation that satisfies the conditions $y(0) = 0$, $y(a) = b$.
The solution of this boundary value problem can be written down at
once, since the general solution of the homogeneous equation $y'' = 0$
is the linear function $c_0 + c_1 x$, and the solution of the
nonhomogeneous equation that, with its first derivative, vanishes at the origin
is given by the integral $\int_0^x g(\xi)(x - \xi)\, d\xi$ [see (42), p. 78]. In the
general solution

$$y(x) = c_0 + c_1 x + \int_0^x g(\xi)(x - \xi)\, d\xi$$

the condition $y(0) = 0$ at once gives $c_0 = 0$, and then the condition
$y(a) = b$ determines $c_1$ through the equation

$$b = c_1 a + \int_0^a g(\xi)(a - \xi)\, d\xi.$$
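
As an illustration of how the boundary conditions fix the constants, the following Python sketch computes $c_1$ from $b = c_1 a + \int_0^a g(\xi)(a - \xi)\, d\xi$ and evaluates $y(x) = c_1 x + \int_0^x g(\xi)(x - \xi)\, d\xi$. The midpoint quadrature, the function names, and the sample data (a uniform load $g = 2$ on a cable from $(0, 0)$ to $(3, 1)$) are assumptions made here for the example.

```python
def midpoint_integral(func, lower, upper, n=1000):
    """Composite midpoint rule for the integral of func over [lower, upper]."""
    width = (upper - lower) / n
    return sum(func(lower + (k + 0.5) * width) for k in range(n)) * width

def cable_shape(g, a, b):
    """Return c1 and y(x) = c1*x + integral_0^x g(s)(x - s) ds with y(0) = 0, y(a) = b."""
    c1 = (b - midpoint_integral(lambda s: g(s) * (a - s), 0.0, a)) / a

    def y(x):
        return c1 * x + midpoint_integral(lambda s: g(s) * (x - s), 0.0, x)

    return c1, y

# Hypothetical data: uniform load g(x) = 2 on a cable from (0, 0) to (3, 1).
c1, y = cable_shape(lambda s: 2.0, 3.0, 1.0)
print(c1, y(3.0), y(1.5))
```

For a uniform load the integral equals $g x^2/2$, so in this example $c_1 = (1 - 9)/3 = -8/3$ and the cable shape is the parabola $y = -\frac{8}{3} x + x^2$.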


In practice, we must often deal with a more complicated form
of this boundary value problem in which the cable is subject not
only to the continuously distributed load but also to concentrated
loads, that is, loads that are concentrated at a definite point of the
cable, say, at the point $x = x_0$. Such concentrated loads we shall
consider as ideal limiting cases arising as $\varepsilon \to 0$ from a loading $p(x)$
that acts only in the interval $x_0 - \varepsilon$ to $x_0 + \varepsilon$ and for which

$$\int_{x_0 - \varepsilon}^{x_0 + \varepsilon} p(x)\, dx = P.$$
In this, the total loading $P$ remains constant during the passage
to the limit $\varepsilon \to 0$; the number $P$ is then called the concentrated load
acting at the point $x_0$.¹ By integrating both sides of the differential
equation $y'' = p(x)/S$ over the interval from $x_0 - \varepsilon$ to $x_0 + \varepsilon$ before
making the passage to the limit $\varepsilon \to 0$, we see that the equation
$y'(x_0 + \varepsilon) - y'(x_0 - \varepsilon) = P/S$ holds. If we now perform the passage
to the limit $\varepsilon \to 0$, we obtain the result that a concentrated load $P$
acting at the point $x_0$ corresponds to a jump of the derivative $y'(x)$
by an amount $P/S$ at the point $x_0$.

The following example shows how the presence of a concentrated
load modifies the boundary value problem. We suppose that the
cable is stretched between the points $x = 0$, $y = 0$ and $x = 1$, $y = 1$
and that the only load is a concentrated load of magnitude $P$ acting
at the midpoint $x = \frac{1}{2}$. This physical problem corresponds to the
following mathematical problem: to find a continuous function $y(x)$
that satisfies the differential equation $y'' = 0$ everywhere in the
interval $0 \le x \le 1$ except at the point $x_0 = \frac{1}{2}$; that takes the values
$y(0) = 0$, $y(1) = 1$ on the boundary; and whose derivative has a jump
of the amount $P/S$ at the point $x_0$. In order to find this solution, we
express it in the following way:

$$y(x) = ax + b \qquad \left(0 \le x \le \tfrac{1}{2}\right)$$

and

$$y(x) = c(1 - x) + d \qquad \left(\tfrac{1}{2} \le x \le 1\right).$$

The conditions $y(0) = 0$, $y(1) = 1$ give $b = 0$, $d = 1$. From the
condition that both parts of the function shall give the same value at the
point $x = \frac{1}{2}$, we find that

$$\frac{1}{2} a = \frac{1}{2} c + 1.$$


¹ One often thinks of the concentrated load as described purely formally by a
distributed load

$$p(x) = P\,\delta(x - x_0),$$

where $\delta(x)$ stands for a generalized function (the so-called Dirac function) for which

$$\delta(x) = 0 \ \text{for } x \neq 0 \quad \text{and} \quad \int_{-\infty}^{\infty} \delta(x)\, dx = 1,$$

with no value assigned to $\delta(0)$. No finite value of $\delta(0)$ would be compatible with the
other conditions imposed.
Finally, the requirement that the derivative $y'$ shall increase by the
amount $P/S$ on passing the point $\frac{1}{2}$ gives the condition

$$a + c = -\frac{P}{S}.$$

These conditions yield

$$a = 1 - \frac{P}{2S}, \qquad b = 0, \qquad c = -1 - \frac{P}{2S}, \qquad d = 1,$$

and our solution has been found. Moreover, no other solution with
the same properties exists.
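
The solution for the cable with a single concentrated load can be verified in exact arithmetic. The sketch below uses Python's `fractions` module with hypothetical values $P = 3$, $S = 4$. It builds the piecewise linear function $y = ax + b$ for $x \le \frac{1}{2}$, $y = c(1 - x) + d$ for $x \ge \frac{1}{2}$, with $a = 1 - P/2S$, $b = 0$, $c = -1 - P/2S$, $d = 1$, and checks the boundary values, the continuity at the midpoint, and the jump $P/S$ of the slope.

```python
from fractions import Fraction

def cable_with_point_load(P, S):
    """Coefficients a, b, c, d of y = a*x + b (x <= 1/2) and y = c*(1 - x) + d (x >= 1/2)."""
    ratio = Fraction(P) / Fraction(S)
    return 1 - ratio / 2, Fraction(0), -1 - ratio / 2, Fraction(1)

def sag(x, P, S):
    """Piecewise linear solution of the boundary value problem."""
    a, b, c, d = cable_with_point_load(P, S)
    x = Fraction(x)
    return a * x + b if x <= Fraction(1, 2) else c * (1 - x) + d

# Hypothetical values P = 3, S = 4.
a, b, c, d = cable_with_point_load(3, 4)
half = Fraction(1, 2)
left_slope, right_slope = a, -c   # derivative of c*(1 - x) + d is -c
print(sag(0, 3, 4), sag(1, 3, 4), right_slope - left_slope)
print(a * half + b, c * half + d)  # the two one-sided values at x = 1/2
```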
Example 2—The Loaded Beam¹


The treatment of a loaded beam is very similar (cf. Fig. 6.8). Let us
suppose that in its position of rest the beam coincides with the
$x$-axis between the abscissas $x = 0$ and $x = a$. Then it is found that the
sag (vertical displacement) $y(x)$ due to a force acting vertically in the
$y$-direction is given by the linear differential equation of the fourth
order

(19a)  $y''''(x) = \varphi(x),$

where the right-hand side $\varphi(x)$ is $p(x)/EI$, $p(x)$ being the density of
loading, $E$ the modulus of elasticity of the material of the beam ($E$ is
the stress divided by the elongation), and $I$ the moment of inertia of
the cross section of the beam about a horizontal line through the
center of mass of the cross section.

Figure 6.8 Loaded beam.

The general solution of this differential equation can at once be
written [(42), p. 78] in the form

$$y(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \int_0^x \varphi(\xi) \frac{(x - \xi)^3}{3!}\, d\xi,$$


¹ For the theory of loaded beams, cf. v. Karman and Biot, Mathematical Methods in
Engineering.
where $c_0$, $c_1$, $c_2$, $c_3$ are arbitrary constants of integration. The real
problem, however, is not that of finding this general solution but
of finding a particular solution, that is, of determining the constants
of integration in such a way that certain definite boundary conditions
are satisfied. If, for example, the beam is clamped at the ends, the
boundary conditions

$$y(0) = 0, \quad y(a) = 0, \quad y'(0) = 0, \quad y'(a) = 0$$

hold. It then follows at once that $c_0 = c_1 = 0$, and the constants $c_2$
and $c_3$ are to be determined from the equations

$$c_2 a^2 + c_3 a^3 + \int_0^a \varphi(\xi) \frac{(a - \xi)^3}{3!}\, d\xi = 0,$$

$$2c_2 a + 3c_3 a^2 + \int_0^a \varphi(\xi) \frac{(a - \xi)^2}{2!}\, d\xi = 0.$$


For beams, too, the problem of concentrated loads is important.
We again think of the concentrated load acting at the point $x = x_0$
as arising from a loading $p(x)$, distributed continuously over the
interval $x_0 - \varepsilon$ to $x_0 + \varepsilon$, for which $\int_{x_0 - \varepsilon}^{x_0 + \varepsilon} p(\xi)\, d\xi = P$; we again
let $\varepsilon$ approach zero and at the same time let $p(x)$ increase in such
a way that the value of $P$ remains constant during the passage to the
limit $\varepsilon \to 0$. $P$ is then the value of the concentrated load at $x = x_0$.
Just as in the example above, we integrate both sides of the
differential equation (19a) over the interval from $x_0 - \varepsilon$ to $x_0 + \varepsilon$ and then
pass to the limit as $\varepsilon \to 0$. It is found that the third derivative of the
solution $y(x)$ must have a jump at the point $x = x_0$, amounting to

(19b)  $y'''(x_0 + 0) - y'''(x_0 - 0) = \frac{P}{EI}.$

Here $y'''(x_0 + 0)$ means the limit of $y'''(x_0 + h)$ as $h$ tends to 0 through
positive values, $y'''(x_0 - 0)$ being the corresponding limit from the
left.

Thus, the following mathematical problem arises: we attempt to
find a solution of $y'''' = 0$ that, together with its first and second
derivatives, is continuous, for which $y(0) = y(1) = y'(0) = y'(1) = 0$,
and whose third derivative has a jump of the amount $P/EI$ at the
point $x = x_0$ and elsewhere is continuous.

If the beam is fixed at a point $x = x_0$ (cf. Fig. 6.9), that is, if at this
point the sag has the fixed preassigned value $y = 0$, we can think of
this constraint as being achieved by means of a concentrated load
acting at that point. By the mechanical principle that action is
equal to reaction, the value of this concentrated load will be equal
to the force that the fixed beam exerts on its support. The magnitude
$P$ of this force is then given at once by the formula [see (19b)]

$$P = EI \{y'''(x_0 + 0) - y'''(x_0 - 0)\},$$

where $y(x)$ satisfies the differential equation $y'''' = p/EI$ everywhere
in the interval $0 \le x \le 1$ except at the point $x = x_0$ and in addition
also satisfies the conditions $y(0) = y(1) = y'(0) = y'(1) = 0$, $y(x_0) = 0$,
and $y$, $y'$, and $y''$ are also continuous at $x = x_0$.

Figure 6.9 Sag of beam supported in the middle.

In order to illustrate these ideas, we consider a beam that
extends from the point $x = 0$ to the point $x = 1$, is clamped at its end
points $x = 0$ and $x = 1$, carries a uniform load of density $p(x) = 1$,
and is supported at the point $x = \frac{1}{2}$ (cf. Fig. 6.9). For the sake of
simplicity we assume that $EI = 1$, so that the beam satisfies the
differential equation

$$y'''' = 1$$

everywhere, except at the point $x = \frac{1}{2}$.

As the formula shows, the general solution of the differential
equation is a polynomial of the fourth degree in $x$, the coefficient of
$x^4$ being $1/4!$. The solution will be expressed by a polynomial of this
type in each of the two half-intervals. For the first half-interval we
write the polynomial in the form

$$y = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \frac{1}{24} x^4;$$

in the second half-interval, in the form

$$y = c_0 + c_1 (x - 1) + c_2 (x - 1)^2 + c_3 (x - 1)^3 + \frac{1}{24} (x - 1)^4.$$
Since the beam is clamped at the ends $x = 0$ and $x = 1$, it follows
that

$$y(0) = y(1) = y'(0) = y'(1) = 0,$$

whence we obtain $b_0 = b_1 = c_0 = c_1 = 0$. In addition, $y(x)$, $y'(x)$,
$y''(x)$ must be continuous at the point $x = \frac{1}{2}$; that is, the values of
$y(\frac{1}{2})$, $y'(\frac{1}{2})$, $y''(\frac{1}{2})$ calculated from the two polynomials must be the
same, and the value of $y(\frac{1}{2})$ must be 0. This gives

$$\frac{1}{4} b_2 + \frac{1}{8} b_3 + \frac{1}{384} = \frac{1}{4} c_2 - \frac{1}{8} c_3 + \frac{1}{384} = 0,$$

$$b_2 + \frac{3}{4} b_3 + \frac{1}{48} = -c_2 + \frac{3}{4} c_3 - \frac{1}{48},$$

$$2b_2 + 3b_3 = 2c_2 - 3c_3.$$
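From this we obtain the following values for $b_2$, $b_3$, $c_2$, $c_3$:

$$b_2 = c_2 = \frac{1}{96}, \qquad b_3 = -c_3 = -\frac{1}{24},$$

and the force that must act on the beam at the point $x = \frac{1}{2}$ in order
that no sag may occur at that point is given by

$$y'''\left(\tfrac{1}{2} + 0\right) - y'''\left(\tfrac{1}{2} - 0\right) = \left(6c_3 - \frac{1}{2}\right) - \left(6b_3 + \frac{1}{2}\right) = -\frac{1}{2}.$$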
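
The small linear system of this example can be solved in exact rational arithmetic. The sketch below is an illustration only: the helper `solve` and the ordering of the unknowns $(b_2, b_3, c_2, c_3)$ are choices made here. It encodes the conditions $y(\frac{1}{2}) = 0$ from both sides and the continuity of $y'$ and $y''$ at $x = \frac{1}{2}$ for the polynomials $b_2 x^2 + b_3 x^3 + x^4/24$ and $c_2 (x - 1)^2 + c_3 (x - 1)^3 + (x - 1)^4/24$, then evaluates the jump of the third derivative.

```python
from fractions import Fraction as F

def solve(matrix, rhs):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(rhs)
    rows = [list(map(F, row)) + [F(value)] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [vr - factor * vc for vr, vc in zip(rows[r], rows[col])]
    return [rows[r][n] for r in range(n)]

# Unknowns in the order (b2, b3, c2, c3).
matrix = [
    [F(1, 4), F(1, 8), 0, 0],        # y(1/2) = 0 from the left polynomial
    [0, 0, F(1, 4), F(-1, 8)],       # y(1/2) = 0 from the right polynomial
    [1, F(3, 4), 1, F(-3, 4)],       # continuity of y' at x = 1/2
    [2, 3, -2, 3],                   # continuity of y'' at x = 1/2
]
rhs = [F(-1, 384), F(-1, 384), F(-1, 24), 0]
b2, b3, c2, c3 = solve(matrix, rhs)

# y''' equals 6*b3 + x on the left and 6*c3 + (x - 1) on the right of x = 1/2.
jump = (6 * c3 - F(1, 2)) - (6 * b3 + F(1, 2))
print(b2, b3, c2, c3, jump)
```

The magnitude $\frac{1}{2}$ of the jump is plausible on physical grounds: the total load on the beam is 1, and by symmetry the middle support carries half of it.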


6.2 The General Linear Differential Equation of the First Order 


a. Separation of Variables 


A differential equation is said to be of the first order if it involves, 
besides x and y(x), the first derivative of the function y(x) but no 
higher derivative. The most general equation of this type is 


(20a) F(x, y, y’) = 0, 


where F is a given function of its three arguments x, y, y’. We can 
assume that in a certain region of the x, y-plane the differential 
equation (20a) can be solved uniquely for y’ and thus expressed in 
the form 


(20b)  $y' = f(x, y).$
Explicit formulae for the general solution of a differential
equation (20b) can only be found in special cases.¹ The simplest situation
arises when the function $f(x, y)$ is the quotient of a function of $x$
alone and of a function of $y$ alone, that is, when the differential
equation has the form

(21a)  $y' = \frac{\alpha(x)}{\beta(y)}.$

In this case we can “separate” the variables $x$, $y$, writing the equation
symbolically in the form

(21b)  $\beta(y)\, dy = \alpha(x)\, dx.$


We now introduce the two indefinite integrals

(21c)  $A(x) = \int \alpha(x)\, dx, \qquad B(y) = \int \beta(y)\, dy$

obtained by ordinary quadratures. Then by (21a)

$$\frac{dB(y)}{dx} = \frac{dB}{dy} \frac{dy}{dx} = \beta(y)\, y' = \alpha(x) = \frac{dA}{dx}.$$

It follows that for every solution of (21a)

(21d)  $B(y) - A(x) = c,$

where $c$ is a constant (depending on the solution).² Equation (21d)
may now be solved for $y$, assigning any value to $c$, and the required
solution of (21a) is thus obtained by quadratures.

As a matter of fact, we already have used this method of separation 
of variables in a variety of problems leading to differential equations 
(see Volume I, p. 406; Volume II, p. 668). Another type of differential 
equation that can be reduced to the form (21a) is the so-called
homogeneous equation

(21e)  $y' = f\left(\frac{y}{x}\right).$

¹ We shall, however, discuss on p. 704 a general approximation scheme giving the
solution of (20b) in all cases, where the function $f$ has continuous first derivatives.
² Instead of using the chain rule in the derivation of (21d), we could also argue that by
(21b, c)

$$d(B - A) = dB - dA = \beta\, dy - \alpha\, dx = 0$$

and, hence, that $B - A$ is constant.
Introducing the new unknown function $z = y/x$, we arrive at a
differential equation

$$z' = \frac{xy' - y}{x^2} = \frac{f(z) - z}{x},$$


which is separable. The general solution is then found from the
relation

(21f)  $\int \frac{dz}{f(z) - z} = \int \frac{dx}{x} + c = c + \log|x|,$

where $c$ is a constant. We use this equation to express $z$ as a function
of $x$ and put $y = xz$ to obtain the required solution.
As an example, consider the equation

$$y' = \frac{y^2}{x^2},$$

corresponding to $f(z) = z^2$. Here relation (21f) becomes

$$\int \frac{dz}{z^2 - z} = \log\left|\frac{z - 1}{z}\right| = c + \log|x|.$$

Hence,

$$\frac{y - x}{y} = kx,$$

where $k = \pm e^c$ is a constant.
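
The result of the example, $(y - x)/y = kx$, can be solved for $y$ to give $y = x/(1 - kx)$. The following Python sketch checks numerically that this function satisfies $y' = y^2/x^2$, approximating the derivative by a central difference. The value $k = 0.5$ and the sample points are arbitrary choices made for the illustration.

```python
def y(x, k):
    """The relation (y - x)/y = k*x solved for y."""
    return x / (1.0 - k * x)

def residual(x, k, step=1e-5):
    """y' - y^2/x^2, with y' approximated by a central difference."""
    derivative = (y(x + step, k) - y(x - step, k)) / (2.0 * step)
    return derivative - (y(x, k) / x) ** 2

worst = max(abs(residual(x, 0.5)) for x in (0.3, 0.7, 1.2))
print(worst)  # small; limited only by the finite-difference approximation
```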


b. The Linear First-Order Equation 


A differential equation is called linear if it represents a linear 
relation between the unknown function y and its derivatives with 
coefficients that are given functions of x. Thus, the general first-order 
linear differential equation has the form 


(22a) y' + a(x) y = b(x) 


where a(x) and b(x) are given. 
We first suppose that $b = 0$. Then the differential equation is
separable and can be written as

$$\frac{dy}{y} = -a(x)\, dx.$$

Hence,

$$\log|y| = -\int a(x)\, dx + \text{constant}.$$

If we denote by $A(x)$ any indefinite integral of the function $a(x)$,
that is, any function with derivative $a(x)$, we find that

(22b)  $y = c e^{-A(x)},$


where c is an arbitrary constant of integration. This formula gives 
a solution, even when c = 0, namely, y = 0. 
If $b(x)$ is not zero, we seek a solution of the form

(22c)  $y = u(x) e^{-A(x)},$

where $A$ is defined as before and $u(x)$ must be suitably determined.¹
One finds by substitution into (22a) that

$$y' + ay = u' e^{-A} - u A' e^{-A} + a u e^{-A} = u' e^{-A} = b.$$

Hence, the unknown function $u$ must have the derivative

$$u' = b(x) e^{A(x)}.$$

Thus,

$$u = c + \int b(x) e^{A(x)}\, dx,$$

where $c$ is a constant. We find for the solution $y$ of (22a) the
expression

(22d)  $y = e^{-A(x)} \left(c + \int b(x) e^{A(x)}\, dx\right),$

where $c$ is any constant and

(22e)  $A(x) = \int a(x)\, dx.$


Since every function $y$ can be written in the form (22c) with a suitable
function $u$, we see that formula (22d) represents the most general
solution of (22a). Thus, the general solution is formed from known
functions merely by exponentiation and the ordinary process of
integration. The solution really contains only one arbitrary constant,
since any different choice of the constants of integration in $A(x)$ or
in the indefinite integral occurring in (22d) can be compensated for by
a suitable change in $c$.

¹ This device of replacing the constant $c$ in (22b) by the variable $u$ is known as
variation of parameters.

For example, in the case of the differential equation 


$$y' + xy = -x$$

we have

$$A(x) = \int x\, dx = \frac{x^2}{2},$$

$$\int b(x) e^{A(x)}\, dx = -\int x e^{x^2/2}\, dx = -e^{x^2/2},$$

and, hence, obtain the solution

$$y = e^{-x^2/2} \left(c - e^{x^2/2}\right) = -1 + c e^{-x^2/2}.$$
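
The solution formula for the linear first-order equation $y' + a(x) y = b(x)$ lends itself to direct numerical evaluation. The sketch below is one possible arrangement, not a prescribed method: it fixes the integration constants by taking both integrals from 0 to $x$, so that the constant $c$ equals the initial value $y(0)$, and uses the trapezoidal rule. It is then compared with the closed form $-1 + (y(0) + 1) e^{-x^2/2}$ for the example $y' + xy = -x$; the initial value 2 and the end point 1.5 are arbitrary.

```python
import math

def solve_linear_first_order(a, b, y0, x_end, n=2000):
    """Evaluate y = exp(-A) * (c + integral of b*exp(A)) with A(0) = 0 and c = y0."""
    h = x_end / n
    A = 0.0          # running value of A(x), the integral of a from 0 to x
    integral = 0.0   # running value of the integral of b*exp(A) from 0 to x
    x = 0.0
    f_prev = b(x) * math.exp(A)
    for i in range(n):
        x_next = (i + 1) * h
        A_next = A + 0.5 * h * (a(x) + a(x_next))      # trapezoidal step for A
        f_next = b(x_next) * math.exp(A_next)
        integral += 0.5 * h * (f_prev + f_next)        # trapezoidal step for the integral
        x, A, f_prev = x_next, A_next, f_next
    return math.exp(-A) * (y0 + integral)

numeric = solve_linear_first_order(lambda x: x, lambda x: -x, 2.0, 1.5)
exact = -1.0 + 3.0 * math.exp(-1.5 ** 2 / 2.0)
print(numeric, exact)
```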


Exercises 6.2 


1. Integrate the following equations by separation of the variables:
(a) $(1 + y^2)x\, dx + (1 + x^2)\, dy = 0$
(b) $y e^{2x}\, dx - (1 + e^{2x})\, dy = 0.$
2. Solve the following homogeneous equations:
(a) $y^2\, dx + x(x - y)\, dy = 0$
(b) $xy\, dx + (x^2 + y^2)\, dy = 0$
(c) $x^2 - y^2 + 2xyy' = 0$
(d) $(x + y)\, dx + (y - x)\, dy = 0$
(e) $(x^2 + xy)y' = x\sqrt{x^2 - y^2} + xy + y^2.$
3. Show that a differential equation of the form

$$y' = f\left(\frac{ax + by + c}{a_1 x + b_1 y + c_1}\right) \qquad (a, a_1, \ldots \text{ constant})$$

can be reduced to a homogeneous equation as follows. If $ab_1 - a_1 b \neq 0$,
we take a new unknown function and a new independent variable

$$\eta = ax + by + c, \qquad \xi = a_1 x + b_1 y + c_1.$$

If $ab_1 - a_1 b = 0$, we need only change the unknown function by putting

$$\eta = ax + by$$

to reduce the equation to a new equation in which the variables are
separated.
4. Apply the method of the previous exercise to 


(a) $(2x + 4y + 3)y' = 2y + x + 1$
(b) $(3y - 7x + 3)y' = 3y - 7x + 7.$
5. Integrate the following linear differential equations of the first order: 


(a) $y' + y \cos x = \cos x \sin x$


,_ WY _ — ox n 
b) y- ee 


(c) $x(x - 1)y' + (1 - 2x)y + x^2 = 0$


, 2 
(d) y — y= x 
(e) $(1 + x^2)y' + xy = \frac{1}{1 + x^2}.$
6. Integrate the equation 
y +y’ = a 
7. A Bernoulli equation has the form

$$y' + f(x) y = g(x) y^n.$$

Show that such an equation is made separable by the substitution

$$y = v \exp\left\{-\int f(x)\, dx\right\} = v F(x).$$
8. Integrate the equation 
xy + y(1 — xy) = 0. 
9. By any method available, solve 


$$y' + y \sin x + y^n \sin 2x = 0.$$
6.3 Linear Differential Equations of Higher Order 


a. Principle of Superposition. General Solutions 


Many of the examples previously discussed belong to the general
class of linear differential equations. A differential equation in the
unknown function $u(x)$ is said to be linear of the $n$th order if it has
the form

(23)  $u^{(n)}(x) + a_1 u^{(n-1)}(x) + \cdots + a_n u(x) = \varphi(x),$

where $a_1, a_2, a_3, \ldots, a_n$ are given functions of the independent
variable $x$, as is also the right-hand side $\varphi(x)$. We denote the
expression on the left side by $L[u]$ (where $L$ stands for “linear
differential operator”).

If $\varphi(x)$ is identically zero in the interval under consideration, we
call the equation homogeneous; otherwise, we call it nonhomogeneous.
We see at once (as in the special case of the linear differential 
equation of the second order with constant coefficients, discussed 
in Volume I, p. 640) that the following principle of superposition 
holds: 


If $u_1$, $u_2$ are any two solutions of the homogeneous equation, every
linear combination of them, $u = c_1 u_1 + c_2 u_2$, where the coefficients $c_1$,
$c_2$ are constants, is also a solution.

If we know a single solution $u(x)$ of the nonhomogeneous equation
$L[u] = \varphi(x)$, we can obtain all other such solutions by adding to
$u(x)$ any solution of the homogeneous equation.

For $n = 2$ and constant coefficients $a_1$, $a_2$ we proved in Volume
I (p. 636) that every solution of the homogeneous equation can be
expressed in terms of two suitably chosen solutions $u_1$, $u_2$ in the form
$c_1 u_1 + c_2 u_2$. An analogous theorem holds for any homogeneous
differential equation with arbitrary continuous coefficients.

To begin with, we explain what we mean by saying that functions
are linearly dependent or linearly independent, by means of the
following definition: $n$ functions $\varphi_1(x), \varphi_2(x), \ldots, \varphi_n(x)$ are linearly
dependent if $n$ constants $c_1, \ldots, c_n$ that do not all vanish exist,
such that the equation

$$c_1 \varphi_1(x) + c_2 \varphi_2(x) + \cdots + c_n \varphi_n(x) = 0$$

holds identically, that is, for all values of $x$ in the interval under
consideration. If, say, $c_n \neq 0$, then $\varphi_n(x)$ may be expressed in the form

$$\varphi_n(x) = a_1 \varphi_1(x) + \cdots + a_{n-1} \varphi_{n-1}(x),$$

and $\varphi_n$ is said to be linearly dependent on the other functions. If no
linear relation of the form

$$c_1 \varphi_1(x) + c_2 \varphi_2(x) + \cdots + c_n \varphi_n(x) = 0$$

exists, the $n$ functions $\varphi_i(x)$ are said to be linearly independent.


¹ Linear dependence of functions $\varphi(x)$ is defined in exactly the same way as
dependence of vectors (see p. 137). As a matter of fact, it often is convenient to visualize
a function $\varphi(x)$ defined in an interval $I$ of the $x$-axis as a “vector $\varphi$ with infinitely
many components,” one component of value $\varphi(x)$ corresponding to each $x$ in $I$.
Example 1

The functions $1, x, x^2, \ldots, x^{n-1}$ are linearly independent.
Otherwise, constants $c_0, c_1, \ldots, c_{n-1}$, not all zero, would have to exist such that the
polynomial

$$c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}$$

vanishes for all values of $x$ in a certain interval. This, however, is
impossible unless all the coefficients of the polynomial are zero.


Example 2 


The functions $e^{a_i x}$ are linearly independent, provided $a_1 < a_2 < \cdots < a_n$.

PROOF. We assume that this statement has been proved true for
$(n - 1)$ such exponential functions. Then if

$$c_1 e^{a_1 x} + c_2 e^{a_2 x} + \cdots + c_n e^{a_n x} = 0$$

is an identity in $x$, we divide by $e^{a_n x}$ and, putting $a_i - a_n = b_i$,
obtain

$$c_1 e^{b_1 x} + c_2 e^{b_2 x} + \cdots + c_{n-1} e^{b_{n-1} x} + c_n = 0.$$

If we differentiate this equation with respect to $x$, the constant $c_n$
disappears and we have an equation that implies that the $(n - 1)$
functions $e^{b_1 x}, e^{b_2 x}, \ldots, e^{b_{n-1} x}$ are linearly dependent, from which it
follows that $e^{a_1 x}, e^{a_2 x}, \ldots, e^{a_{n-1} x}$ are linearly dependent, contrary
to our original assumption. Hence, there cannot be a linear relation
between the $n$ original functions either.


Example 3 


The functions $\sin x, \sin 2x, \sin 3x, \ldots, \sin nx$ are linearly
independent in the interval $0 \le x \le \pi$. We leave the reader to prove
this in Exercise 1, p. 690, using the fact that

$$\int_0^\pi \sin mx \sin nx \, dx = \begin{cases} 0 & \text{if } m \neq n, \\ \pi/2 & \text{if } m = n \end{cases}$$

(cf. Volume I, p. 274).


If we assume that the functions $\phi_i(x)$ have continuous derivatives
up to, and including, the $n$th order, we have the following theorem:

The necessary and sufficient condition that the system of functions
$\phi_i(x)$ shall be linearly dependent is that the equation

$$W = \begin{vmatrix} \phi_1(x) & \phi_2(x) & \cdots & \phi_n(x) \\ \phi_1'(x) & \phi_2'(x) & \cdots & \phi_n'(x) \\ \vdots & \vdots & & \vdots \\ \phi_1^{(n-1)}(x) & \phi_2^{(n-1)}(x) & \cdots & \phi_n^{(n-1)}(x) \end{vmatrix} = 0 \tag{24}$$

shall be an identity in $x$. The function $W$ is called the Wronskian of
the system of functions.¹

That the condition is necessary follows immediately: if we assume
that

$$\sum_i c_i\phi_i(x) = 0,$$

successive differentiation gives the further equations

$$\sum_i c_i\phi_i'(x) = 0, \quad \ldots, \quad \sum_i c_i\phi_i^{(n-1)}(x) = 0.$$

These, however, form a homogeneous system of $n$ equations, which
are satisfied by the $n$ coefficients $c_1, \ldots, c_n$; hence, $W$, the
determinant of the system of equations, must vanish.

That the condition is sufficient, that is, that if $W = 0$ the functions
are linearly dependent, may be proved as follows: From the vanishing
of $W$ we may deduce that the system of equations

$$\begin{aligned} c_1\phi_1 + \cdots + c_n\phi_n &= 0 \\ c_1\phi_1' + \cdots + c_n\phi_n' &= 0 \\ &\;\;\vdots \\ c_1\phi_1^{(n-1)} + \cdots + c_n\phi_n^{(n-1)} &= 0 \end{aligned}$$

possesses a solution $c_1, c_2, \ldots, c_n$ that is not trivial (see p. 150),
where the $c_i$ may still be functions of $x$. Here we may assume without
loss of generality that $c_n = 1$. Further, we may assume that $V$, the
Wronskian of the $n - 1$ functions $\phi_1, \phi_2, \ldots, \phi_{n-1}$, is not zero, for
we may suppose that our theorem has already been proved for
$n - 1$ functions; then $V = 0$ implies the existence of a linear relation
between $\phi_1, \phi_2, \ldots, \phi_{n-1}$ and, hence, between $\phi_1, \phi_2, \phi_3, \ldots, \phi_n$.

¹In this proof and the following one a knowledge of the elements of the theory of
determinants is assumed. Notice that each column of the Wronskian determinant is
the vector formed from a function $\phi$ and its derivatives of orders $1, 2, \ldots, n - 1$.
Thus, vanishing of the Wronskian for a system of functions means that the
corresponding vectors are dependent (see p. 175).

By differentiating¹ the first equation with respect to $x$ and combining
the result with the second, we obtain

$$c_1'\phi_1 + c_2'\phi_2 + \cdots + c_{n-1}'\phi_{n-1} = 0;$$

similarly, by differentiating the second equation and combining the
result with the third, we obtain

$$c_1'\phi_1' + c_2'\phi_2' + \cdots + c_{n-1}'\phi_{n-1}' = 0,$$

and so on, up to

$$c_1'\phi_1^{(n-2)} + c_2'\phi_2^{(n-2)} + \cdots + c_{n-1}'\phi_{n-1}^{(n-2)} = 0.$$

Since $V$, the determinant of these equations, is assumed not to
vanish, it follows that $c_1', c_2', \ldots, c_{n-1}'$ are zero; that is,
$c_1, c_2, \ldots, c_{n-1}$ are constants. Hence, the equation

$$\sum_{i=1}^{n} c_i\phi_i(x) = 0$$

does express a linear relation, as was asserted.
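
The Wronskian criterion (24) can be tried out numerically. The following is a minimal sketch, not part of the original text: the helper names `det`, `wronskian`, and `exponential` are our own, and each function is assumed to be supplied together with a rule for its $k$-th derivative. It evaluates $W$ for the independent exponentials of Example 2 and for the dependent triple $\sin^2 x, \cos^2 x, 1$.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            return 0.0  # a vanishing pivot column means a vanishing determinant
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            d = -d
        d *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return d

def wronskian(functions, x):
    """functions[i](k, x) must return the k-th derivative of the i-th function at x."""
    n = len(functions)
    return det([[phi(k, x) for phi in functions] for k in range(n)])

def exponential(a):
    # k-th derivative of exp(a x) is a^k exp(a x)
    return lambda k, x: a ** k * math.exp(a * x)

def sin_squared(k, x):
    # sin^2 x = (1 - cos 2x) / 2
    if k == 0:
        return math.sin(x) ** 2
    return -0.5 * 2 ** k * math.cos(2 * x + k * math.pi / 2)

def cos_squared(k, x):
    # cos^2 x = (1 + cos 2x) / 2
    if k == 0:
        return math.cos(x) ** 2
    return 0.5 * 2 ** k * math.cos(2 * x + k * math.pi / 2)

def one(k, x):
    return 1.0 if k == 0 else 0.0

independent = wronskian([exponential(1), exponential(2), exponential(3)], 0.0)
dependent = wronskian([sin_squared, cos_squared, one], 0.7)
```

For the exponentials the determinant at $x = 0$ is a Vandermonde determinant, which does not vanish for distinct exponents; for the second triple the relation $\sin^2 x + \cos^2 x - 1 = 0$ forces $W$ to vanish up to rounding.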
We now state the fundamental theorem on linear differential
equations:

Every homogeneous linear differential equation

$$L[u] = a_0(x)u^{(n)}(x) + a_1(x)u^{(n-1)}(x) + \cdots + a_n(x)u(x) = 0 \tag{25}$$

possesses systems of $n$ linearly independent solutions $u_1, u_2, \ldots, u_n$.
By superposing these fundamental solutions every other solution $u$
may be expressed² as a linear expression with constant coefficients
$c_1, \ldots, c_n$:

$$u = \sum_{i=1}^{n} c_iu_i.$$

¹It is easy to see that the coefficients $c_i$ are continuously differentiable functions of
$x$, for if the determinant $V$ is not zero, they can be expressed rationally in terms of
the functions $\phi_i$ and their derivatives.

²Two different systems of fundamental solutions $u_1, \ldots, u_n$; $v_1, \ldots, v_n$ can be
transformed into one another by a linear transformation

$$v_i = \sum_{k=1}^{n} c_{ik}u_k,$$

where the coefficients $c_{ik}$ are constants and form a matrix whose determinant
does not vanish.

In particular, a system of fundamental solutions can be determined
by the following conditions. At a prescribed point, say $x = \xi$, $u_1$ is to
have the value 1 and all the derivatives of $u_1$ up to the $(n - 1)$-th order
are to vanish; $u_i$, where $i > 1$, and all the derivatives of $u_i$ up to the
$(n - 1)$-th order, except the $(i - 1)$-th, are to vanish, while the
$(i - 1)$-th derivative is to have the value 1.

The existence of a system of fundamental solutions will follow from
the existence theorem proved on p. 702. It follows from Wronski's
condition (24), which we have just proved, that a linear relation
must exist between any further solution $u$ and $u_1, \ldots, u_n$, for the
equations

$$\sum_{l=0}^{n} a_lu^{(n-l)} = 0, \qquad \sum_{l=0}^{n} a_lu_i^{(n-l)} = 0 \quad (i = 1, \ldots, n)$$

imply that the Wronskian of the $n + 1$ functions $u, u_1, u_2, \ldots, u_n$
must vanish, so that $u, u_1, u_2, \ldots, u_n$ are linearly dependent. Since
$u_1, \ldots, u_n$ are independent, $u$ depends linearly on $u_1, \ldots, u_n$.


b. Homogeneous Differential Equations of the Second Order 


We shall consider differential equations of the second order in 
more detail, as they have very important applications. 
Let the differential equation be 


$$L[u] = au'' + bu' + cu = 0. \tag{26}$$

If $u_1(x), u_2(x)$ form a system of fundamental solutions,
$W = u_1u_2' - u_2u_1'$ is its Wronskian, and $W' = u_1u_2'' - u_2u_1''$.
Since

$$L[u_1] = 0 \quad \text{and} \quad L[u_2] = 0,$$

it follows that

$$u_1L[u_2] - u_2L[u_1] = aW' + bW = 0.$$

This is a first-order linear equation for $W$. Its general solution by
formula (22b), p. 681 is given by

$$W = ce^{-\int (b/a)\,dx}, \tag{27}$$

where $c$ is a constant. This formula is used a great deal in the further
development of the theory of differential equations of the second
order.
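
Formula (27) lends itself to a quick numerical check. The sketch below is an illustration under our own assumptions: it uses the equation $y'' - 2y'/x + 2y/x^2 = 0$ with the solutions $u_1 = x$, $u_2 = x^2$ (so $a = 1$, $b = -2/x$), and a hand-written Simpson rule named `simpson`. It compares the ratio $W(x)/W(x_0)$ computed from the solutions with $\exp\left(-\int_{x_0}^{x} (b/a)\,dt\right)$.

```python
import math

def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def wronskian(x):
    # u1 = x, u2 = x^2 solve y'' - (2/x) y' + (2/x^2) y = 0
    u1, du1 = x, 1.0
    u2, du2 = x * x, 2.0 * x
    return u1 * du2 - u2 * du1

def ratio_from_formula(x, x0=1.0):
    # W(x)/W(x0) = exp(-integral of b/a from x0 to x), with a = 1 and b = -2/t
    return math.exp(-simpson(lambda t: -2.0 / t, x0, x))

ratio_direct = wronskian(3.0) / wronskian(1.0)
ratio_formula = ratio_from_formula(3.0)
```

Both ratios should agree with $x^2$ evaluated at $x = 3$, since here $W = x^2$.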

Another property worth mentioning is that a linear homogeneous 
differential equation of the second order can always be transformed 
into an equation of the first order, known as Riccati’s differential 
equation. Riccati’s equation is of the form 


$$v' + pv^2 + qv + r = 0,$$

where $v$ is a function of $x$. The linear equation (26) is transformed
into Riccati's equation by putting $u' = uz$, so that
$u'' = u'z + uz' = uz^2 + uz'$, and we have

$$az' + az^2 + bz + c = 0.$$


A third remark: if we know one solution v(x) of our linear homo- 
geneous differential equation of the second order, the problem can be 
reduced to that of solving a differential equation of the first order and 
can be carried out by quadratures. Specifically, if we assume that
$L[v] = 0$ and put $u = zv$, where $z(x)$ is the new function that we are
seeking, we obtain the differential equation

$$az''v + 2az'v' + bz'v + zL[v] = avz'' + (2av' + bv)z' = 0$$

for $z$. This, however, is a linear homogeneous differential equation
for the unknown function $z' = w$; its solution is given by formula
(22d) on p. 681. From $w$ we then obtain the factor $z$ and, hence, the
solution $u$ by a further quadrature.¹

¹The same result is obtained by observing that the Wronskian $W$ formed from $v$ and
any other solution $u$ is given by (27). But, for known $W$ and $v$ the equation
$W = vu' - v'u$ represents a linear first-order equation for $u$ that can be solved by
quadratures.

For example, the linear equation of the second order

$$y'' - \frac{2}{x}y' + \frac{2}{x^2}y = 0$$

is equivalent to Riccati's equation

$$z' + z^2 - \frac{2z}{x} + \frac{2}{x^2} = 0,$$

where $z = y'/y$. The original equation has $y = x$ as a particular
solution; hence, it may be reduced to the equation of the first order
in $v'$

$$v''x = 0,$$

where $v = y/x$. That is, $v = ax + b$. Hence, the general integral of the
original equation is given by

$$y = ax^2 + bx.$$


We mention that exactly the same method can be used to reduce
a linear differential equation of the $n$th order to one of the
$(n - 1)$st order, when one solution of the first equation is known.
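
The worked example above can be confirmed with exact rational arithmetic. The following sketch is only a check we add for illustration (the function names `linear_residual` and `riccati_residual` are ours): it substitutes $y = ax^2 + bx$ into $y'' - 2y'/x + 2y/x^2 = 0$ and substitutes $z = y'/y$ into the associated Riccati equation.

```python
from fractions import Fraction

def linear_residual(a, b, x):
    """Left side of y'' - (2/x) y' + (2/x^2) y for y = a x^2 + b x."""
    y = a * x * x + b * x
    dy = 2 * a * x + b
    d2y = 2 * a
    return d2y - 2 * dy / x + 2 * y / (x * x)

def riccati_residual(a, b, x):
    """Left side of z' + z^2 - 2z/x + 2/x^2 for z = y'/y with the same y."""
    y = a * x * x + b * x
    dy = 2 * a * x + b
    d2y = 2 * a
    z = dy / y
    dz = (d2y * y - dy * dy) / (y * y)  # quotient rule applied to y'/y
    return dz + z * z - 2 * z / x + 2 / (x * x)

sample = (Fraction(3), Fraction(-7), Fraction(5, 2))
```

Because `Fraction` arithmetic is exact, both residuals come out as exactly zero for any constants $a$, $b$ and any $x$ at which $x \neq 0$ and $y \neq 0$.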


Exercises 6.3b 


1. Prove that the functions $\sin x, \sin 2x, \sin 3x, \ldots$ are linearly
independent in the interval $0 \le x \le \pi$. Hint: Any two of these functions are
orthogonal over the interval; namely, if $m \neq n$,

$$\int_0^\pi \sin mx \sin nx \, dx = 0$$

(cf. Volume I, p. 274).
2. Prove that if $a_1, \ldots, a_k$ are different numbers and $P_1(x), \ldots, P_k(x)$ are
arbitrary polynomials (not identically zero), then the functions

$$\phi_1(x) = P_1(x)e^{a_1x}, \; \ldots, \; \phi_k(x) = P_k(x)e^{a_kx}$$

are linearly independent.
3. Show that the so-called Bernoulli equation (cf. Exercise 7 in Section 6.2)

$$y' + a(x)y = b(x)y^n \quad (n \neq 1)$$

reduces to a linear differential equation for the new unknown function
$z = y^{1-n}$. Use this to solve the equations
(a) $xy' + y = y^2 \log x$
(b) $xy^2(xy' + y) = a^2$
(c) $(1 - x^2)y' - xy = axy^2$.
4. Show that Riccati's differential equation

$$y' + P(x)y^2 + Q(x)y + R(x) = 0$$

can be transformed into a linear differential equation if we know a
particular integral $y_1 = y_1(x)$. [Introduce the new unknown function
$u = 1/(y - y_1)$.]
Use this to solve the equation

$$y' - x^2y^2 + x^4 - 1 = 0$$

that possesses the particular integral $y_1 = x$.
5. Find the integrals that are common to the two differential equations

(a) $y' = y^2 + 2x - x^4$
(b) $y' = -y^2 - y + 2x + x^2 + x^4$.
6. Integrate the differential equation

$$y' = y^2 + 2x - x^4$$

in terms of definite integrals, using the particular integral found in
Exercise 5. Draw a rough graph of the integral curves of the equation
throughout the $x, y$-plane.

7. Let $y_1, y_2, y_3, y_4$ be four solutions of Riccati's equation (cf. Exercise 4).
Prove that the expression

$$\frac{(y_1 - y_3)}{(y_1 - y_4)} : \frac{(y_2 - y_3)}{(y_2 - y_4)}$$

is a constant.


8. Show that if two solutions, $y_1(x)$ and $y_2(x)$, of Riccati's equation are
known, then the general solution is given by

$$y - y_1 = c(y - y_2)\exp\left[\int P(y_2 - y_1)\,dx\right],$$

where $c$ is an arbitrary constant.
Hence find the general solution of

$$y' - y\tan x = y^2\cos x - \frac{1}{\cos x},$$

which has solutions of the form $a\cos^n x$.
9. Prove that the equations
(a) $(1 - x)y'' + xy' - y = 0$
(b) $2x(2x - 1)y'' - (4x^2 + 1)y' + y(2x + 1) = 0$

have a common solution. Find it and, hence, integrate both equations
completely.

10. The tangent at a point $P$ of a curve cuts the axis of $y$ at a point $T$ below
the origin $O$ and the curve is such that $OP = n \cdot OT$. Prove that its polar
equation is of the form

$$r = a\,\frac{(1 + \sin\theta)^n}{\cos^{n+1}\theta}.$$

c. The Nonhomogeneous Differential Equation. Method of Variation 
of Parameters 
To solve the nonhomogeneous differential equation

$$L[u] = a_0u^{(n)} + \cdots + a_nu = \phi(x) \tag{28a}$$

in general, it is sufficient, by what we have said on p. 684, to find a
single solution. This may be done as follows: By proper choice of the
constants $c_1, c_2, \ldots, c_n$, we first determine a solution of the
homogeneous equation $L[u] = 0$ in such a way that the equations

$$u(\xi) = 0, \; u'(\xi) = 0, \; \ldots, \; u^{(n-2)}(\xi) = 0, \; u^{(n-1)}(\xi) = 1 \tag{28b}$$

are satisfied. This solution, which depends on the parameter $\xi$, we
denote by $u(x, \xi)$. The function $u(x, \xi)$ is a continuous function of
$\xi$ for fixed values of $x$, and so are its first $n$ derivatives with respect
to $x$. As an example, for the differential equation $u'' + k^2u = 0$ the
solution $u(x, \xi)$ that fulfills the conditions (28b) has the form
$[\sin k(x - \xi)]/k$.

We now assert that the formula

$$v(x) = \int_0^x \phi(\xi)u(x, \xi)\,d\xi \tag{28c}$$

gives a solution of $L[v] = \phi$ that, together with its first $n - 1$
derivatives, vanishes at the point $x = 0$. To verify this statement,¹
we differentiate the function $v(x)$ repeatedly with respect to $x$ by the
rule for the differentiation of an integral with respect to a parameter
[cf. (41) p. 77] and recall the relations following from (28b):

$$u(x, x) = 0, \; u'(x, x) = 0, \; \ldots, \; u^{(n-2)}(x, x) = 0, \; u^{(n-1)}(x, x) = 1,$$

where, for example, $u'(x, x) = \partial u(x, \xi)/\partial x$ for $\xi = x$.
We thus obtain

$$\begin{aligned}
v'(x) &= \phi(\xi)u(x, \xi)\Big|_{\xi = x} + \int_0^x \phi(\xi)u'(x, \xi)\,d\xi = \int_0^x \phi(\xi)u'(x, \xi)\,d\xi, \\
v''(x) &= \phi(\xi)u'(x, \xi)\Big|_{\xi = x} + \int_0^x \phi(\xi)u''(x, \xi)\,d\xi = \int_0^x \phi(\xi)u''(x, \xi)\,d\xi, \\
&\;\;\vdots \\
v^{(n-1)}(x) &= \phi(\xi)u^{(n-2)}(x, \xi)\Big|_{\xi = x} + \int_0^x \phi(\xi)u^{(n-1)}(x, \xi)\,d\xi = \int_0^x \phi(\xi)u^{(n-1)}(x, \xi)\,d\xi, \\
v^{(n)}(x) &= \phi(\xi)u^{(n-1)}(x, \xi)\Big|_{\xi = x} + \int_0^x \phi(\xi)u^{(n)}(x, \xi)\,d\xi = \phi(x) + \int_0^x \phi(\xi)u^{(n)}(x, \xi)\,d\xi.
\end{aligned}$$

Since $L[u(x, \xi)] = 0$, this establishes the equation $L[v] = \phi(x)$ and
shows that the initial conditions $v(0) = 0, v'(0) = 0, \ldots, v^{(n-1)}(0) = 0$
are satisfied.

¹It is possible to give a physical interpretation for this process. If $x = t$ denotes the
time and $u$ the coordinate of a point moving on a straight line subject to a force $\phi(x)$,
the effect of this force may be thought of as arising from the superposition of the
small effects of small impulses. The above solution $u(x, \xi)$ then corresponds to an
impulse of amount 1 at time $\xi$, and our solution gives the effect of impulses of amount
$\phi(\xi)$ during the time between 0 and $x$.

The same solution can also be obtained by the following apparently
different method, which generalizes the procedure used on p. 681
for a first-order equation. We seek a solution $u$ of the nonhomogeneous
equation in the form of a linear combination of independent
solutions $u_i$ of the homogeneous equation

$$u = \sum_i \gamma_i(x)u_i(x), \tag{28d}$$

where now we allow the coefficients $\gamma_i$ to be functions of $x$. On these
functions, we impose the following conditions:

$$\begin{aligned}
\gamma_1'u_1 + \gamma_2'u_2 + \cdots + \gamma_n'u_n &= 0 \\
\gamma_1'u_1' + \gamma_2'u_2' + \cdots + \gamma_n'u_n' &= 0 \\
&\;\;\vdots \\
\gamma_1'u_1^{(n-2)} + \gamma_2'u_2^{(n-2)} + \cdots + \gamma_n'u_n^{(n-2)} &= 0.
\end{aligned}$$

From these it follows that the derivatives of $u$ are given by the
following formulae:

$$\begin{aligned}
u' &= \sum_i \gamma_iu_i' \\
&\;\;\vdots \\
u^{(n-1)} &= \sum_i \gamma_iu_i^{(n-1)} \\
u^{(n)} &= \sum_i \gamma_iu_i^{(n)} + \sum_i \gamma_i'u_i^{(n-1)}.
\end{aligned}$$

Substituting these expressions in the differential equation and
remembering that $L[u] = \phi$, we have

$$\sum_i \gamma_i'u_i^{(n-1)} = \phi(x).$$

For the coefficients $\gamma_i'$ we obtain a linear system of equations, with
determinant $W$, the Wronskian of the system of fundamental solutions
$u_i$, which therefore does not vanish. Thus, the coefficients $\gamma_i'$ are
determined, and hence, by quadratures, so are the coefficients $\gamma_i$.
As the whole argument can be reversed, a solution of the equation has
actually been found, which, in fact, is the general solution, by virtue
of the integration constants concealed in the coefficients $\gamma_i$.

We leave it to the reader to show that the two methods are really
identical, by expressing $u(x, \xi)$, the solution of the homogeneous
equation defined above, in the form

$$u(x, \xi) = \sum_i \alpha_i(\xi)u_i(x).$$


The latter method is known as variation of parameters, because it 
exhibits the solution as a linear combination of functions with 
variable coefficients, whereas in the case of the homogeneous equation 
these coefficients were constants. 


Example 1 
We consider the equation

$$u'' - 2\frac{u'}{x} + 2\frac{u}{x^2} = xe^x.$$

By p. 690, a system of independent solutions of the corresponding
homogeneous equation

$$u'' - 2\frac{u'}{x} + 2\frac{u}{x^2} = 0$$

is given by $u_1 = x$, $u_2 = x^2$. Hence, if we seek solutions of the form

$$u = \gamma_1x + \gamma_2x^2,$$

we have the conditions

$$\gamma_1'x + \gamma_2'x^2 = 0, \qquad \gamma_1' + 2\gamma_2'x = xe^x$$

for $\gamma_1$ and $\gamma_2$. That is,

$$\gamma_1' = -xe^x, \qquad \gamma_2' = e^x.$$

Hence, the general solution of the original nonhomogeneous
equation is

$$u = xe^x + c_1x + c_2x^2.$$
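
The steps of Example 1 can be retraced numerically. The sketch below is an illustration with assumptions of our own: the names `simpson`, `gamma_derivatives`, and `particular_solution` are hypothetical, the $2 \times 2$ system for $\gamma_1', \gamma_2'$ is solved by Cramer's rule, and the quadratures start at the arbitrarily chosen point $x_0 = 1$.

```python
import math

def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def gamma_derivatives(x):
    """Solve g1' u1 + g2' u2 = 0, g1' u1' + g2' u2' = phi by Cramer's rule."""
    u1, du1 = x, 1.0
    u2, du2 = x * x, 2.0 * x
    phi = x * math.exp(x)
    w = u1 * du2 - u2 * du1          # Wronskian of u1, u2 (equals x^2)
    return -u2 * phi / w, u1 * phi / w

def particular_solution(x, x0=1.0):
    """u = gamma1 * u1 + gamma2 * u2 with gamma_i obtained by quadrature from x0."""
    g1 = simpson(lambda t: gamma_derivatives(t)[0], x0, x)
    g2 = simpson(lambda t: gamma_derivatives(t)[1], x0, x)
    return g1 * x + g2 * x * x

value = particular_solution(2.0)
```

With the lower limit $x_0 = 1$ the quadratures give $\gamma_1 = (1 - x)e^x$ and $\gamma_2 = e^x - e$, so the computed solution is $xe^x - ex^2$, which differs from $xe^x$ only by a multiple of $u_2 = x^2$, as the general solution predicts.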


Example 2 


As an application we give a method for dealing with forced vibra- 
tions, for which the right side of the differential equation need no 
longer be periodic, as in the cases considered in Volume I, Chapter 
9, p. 641, but may instead be an arbitrary continuous function f(t). 
For the sake of simplicity we restrict ourselves to the frictionless case 
and take m = 1 (or, what amounts to the same thing, divide through 
by m). Accordingly, we write the differential equation in the form 


$$\ddot{x}(t) + k^2x(t) = \phi(t), \tag{28e}$$

where the quantities $k^2$ and $\phi$ are what we called $k$ and $f$ before.
According to (28c), the function

$$F(t) = \frac{1}{k}\int_0^t \phi(\lambda)\sin k(t - \lambda)\,d\lambda$$

is a solution of the differential equation (28e) and satisfies the initial
conditions

$$F(0) = 0, \qquad F'(0) = 0.$$

For the general solution of the differential equation we thus obtain,
just as before, the function

$$x(t) = \frac{1}{k}\int_0^t \phi(\lambda)\sin k(t - \lambda)\,d\lambda + c_1\sin kt + c_2\cos kt,$$

where $c_1$ and $c_2$ are arbitrary constants of integration.

In particular, if the function on the right side of the differential 
equation is a periodic function of the form sin wt or cos wt, a simple 
calculation again yields the results of Volume I, Chapter 9, p. 642. 
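
The integral formula for $F(t)$ in Example 2 is easy to evaluate numerically. The sketch below is a hedged illustration, not a definitive implementation: the names `simpson` and `forced_response` are our own, and the two test forces $\phi \equiv 1$ and $\phi(t) = t$ are chosen only because their responses, $(1 - \cos kt)/k^2$ and $t/k^2 - (\sin kt)/k^3$, can be checked by direct substitution into (28e).

```python
import math

def simpson(g, lo, hi, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def forced_response(phi, k, t):
    """F(t) = (1/k) * integral from 0 to t of phi(lam) * sin(k (t - lam)) d lam."""
    return simpson(lambda lam: phi(lam) * math.sin(k * (t - lam)), 0.0, t) / k

k, t = 2.0, 1.3
constant_force = forced_response(lambda lam: 1.0, k, t)
linear_force = forced_response(lambda lam: lam, k, t)
```

Any other continuous $\phi$ can be passed in the same way; only the quadrature rule limits the accuracy.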


Exercises 6.3c 


1. Integrate the following equations: 
(a) y” —y=0. 
(b) $y''' - 4y'' + 5y' - 2y = 0$.
(c) $y''' - 3y'' + 3y' - y = 0$.
(d) y” — 3y” + 2y=0 
(e) xy” + xy — y= 0. 
2. Prove that the linear homogeneous equation

$$L(y) = y^{(n)} + c_1y^{(n-1)} + \cdots + c_{n-1}y' + c_ny = 0$$

with constant coefficients $c_i$ has a system of fundamental solutions of the
form $x^\mu e^{a_kx}$, where the $a_k$'s are the roots of the polynomial

$$f(z) = z^n + c_1z^{n-1} + \cdots + c_n.$$
3. Let

$$a_0y + a_1y' + \cdots + a_ny^{(n)} = P(x)$$

be a linear nonhomogeneous differential equation of the $n$th order with
constant coefficients, and let $P(x)$ be a polynomial. Let $a_0 \neq 0$ and
consider the formal identity

$$\frac{1}{a_0 + a_1t + \cdots + a_nt^n} = b_0 + b_1t + b_2t^2 + \cdots.$$

Prove that

$$y = b_0P(x) + b_1P'(x) + b_2P''(x) + \cdots$$

is a particular integral of the differential equation.
If $a_0 = 0$, but $a_1 \neq 0$, then the expansion

$$\frac{1}{a_1t + a_2t^2 + \cdots + a_nt^n} = bt^{-1} + b_0 + b_1t + b_2t^2 + \cdots$$

is possible. Prove that now

$$y = b\int P(x)\,dx + b_0P(x) + b_1P'(x) + b_2P''(x) + \cdots$$

is a particular integral of the differential equation.
4. Apply the method of Exercise 3 to find particular integrals of

(a) $y'' + y = 3x^2 - 5x$


(b) y” +y =(1+ x) 
5. A particular integral of the equation

$$a_0y + a_1y' + \cdots + a_ny^{(n)} = e^{kx}P(x),$$

where $k, a_0, a_1, \ldots$ are real constants and $P(x)$ is a polynomial, can
be found by introducing a new unknown function $z = z(x)$ given by

$$y = ze^{kx}$$

and applying the method of Exercise 3 to the equation in $z$.
Use this method to find particular integrals of 


(a) y” + 4y + 3y = 3e? 
(b) $y'' - 2y' + y = xe^x$.
6. Integrate the equation

$$y'' - 5y' + 6y = e^x(x^2 - 3)$$

completely.
7. (a) If $u, v$ are two independent solutions of the equation

$$f(x)y''' - f'(x)y'' + \phi(x)y' + \lambda(x)y = 0,$$

prove that the complete solution is $Au + Bv + Cw$, where

$$w = u\int \frac{vf(x)\,dx}{(uv' - u'v)^2} - v\int \frac{uf(x)\,dx}{(uv' - u'v)^2}$$

and $A, B, C$ are arbitrary constants.
(b) Solve the equation

$$x^2(x^2 + 5)y''' - x(7x^2 + 25)y'' + (22x^2 + 40)y' - 30xy = 0$$

that has solutions of the form $x^n$.
6.4 General Differential Equations of the First Order 


a. Geometrical Interpretation 


We begin by considering a differential equation of the first order 
(29) F(x, y, y’) = 0, 


where we assume that the function F is a continuously differentiable 
function of its three arguments x, y, y’. Geometrically at a point in the 
plane with rectangular coordinates (x, y), the equation is a condition 
on the direction of the tangent to any curve y(x) passing through 
this point that satisfies the differential equation. We assume that in 
a certain region R of a plane, say in a rectangle, the differential equa- 
tion F(x, y, y’) = 0 can be solved uniquely for y’ and, thus, can be 
expressed in the form 


$$y' = f(x, y), \tag{30}$$

where the function $f(x, y)$ is continuously differentiable in $x$ and $y$.
Then to each point $(x, y)$ of $R$ equation (30) assigns a direction of
advance. The differential equation is therefore represented
geometrically by a field of directions; and the problem of solving the
differential equation geometrically consists in the finding of those
curves that belong to this field of directions, that is, those whose
tangents at every point have the direction preassigned by the equation
$y' = f(x, y)$. We call these curves the integral curves of the differential
equation.

It is now intuitively plausible that through each point (x, y) of R 
there passes a single integral curve of the differential equation y’ = 
f(x, y). These facts are stated more precisely in the following funda- 
mental existence theorem: 
If in the differential equation $y' = f(x, y)$ the function $f$ is continuous
and has a continuous derivative with respect to $y$ in a region $R$, then
through each point $(x_0, y_0)$ of $R$ there passes one, and only one, integral
curve; that is, there exists in a neighborhood of $x_0$ one, and only one,
solution $y(x)$ of the differential equation for which $y(x_0) = y_0$.

We shall return to the proof of this theorem on p. 702. Here we
confine ourselves to the consideration of some examples.

For the differential equation

$$y' = -\frac{x}{y} \tag{31a}$$

that we consider in the region $y < 0$, say, the field at a point $(x, y)$
is readily seen to have a direction perpendicular to the vector from
the origin to the point $(x, y)$. From this we infer by geometry that the
circular arcs about the origin must be the integral curves of the
differential equation. This result is very easily verified analytically,
for by the method of separation of variables (p. 679), it follows that

$$x^2 + y^2 = \text{constant} = c,$$

which shows that these circles are the solutions of the differential
equation.
At each point, the field of directions of the differential equation

$$y' = \frac{y}{x} \tag{31b}$$

obviously has the direction of the line joining that point to the
origin. Thus, the lines through the origin belong to this field of
directions and are therefore integral curves. As a matter of fact, we
see at once that the function $y = cx$ satisfies the differential equation
for any arbitrary constant $c$.¹

¹At the origin the field of directions is no longer uniquely defined; this is connected
with the fact that an infinite number of integral curves pass through this singular
point of the differential equation.

In the same way, we can verify analytically that the differential
equations

$$y' = \frac{x}{y} \quad (y \neq 0)$$

and

$$y' = -\frac{y}{x} \quad (x \neq 0)$$

are satisfied by the respective families of hyperbolas

$$y^2 - x^2 = c$$

and

$$y = \frac{c}{x},$$

where $c$ is the parameter specifying the particular curve of the family.

Our fundamental theorem shows that, in general, differential 
equations of the first order are satisfied by a one-parameter family 
of functions. Functions of x in such a family depend not only on x but 
also on a parameter c, for example, on c = yo = y(0); as we say, the 
solutions depend on an arbitrary constant of integration. Ordinary in- 
tegration of a function f(x) is merely the special case of the solution 
of the differential equation in which f(x, y) does not involve y. The 
direction of the field at a point is then determined by the x-coordinate 
alone, and we see at once that the integral curves are obtained from 
one another by translation in the direction of the y-axis. Analytically, 
this corresponds to the familiar fact that the indefinite integral y, 
that is, the solution of the differential equation y’ = f(x), involves 
an arbitrary additive constant c. 

The geometrical interpretation of the differential equation sug- 
gests an approximate graphical construction of the integral curves, 
in much the same way as in the special case of the indefinite integra- 
tion of a function of x (Volume I, p. 483). We have only to think of the 
integral curve as replaced by a polygon in which each side has the 
direction assigned by the field of directions for its initial point (or for 
any other one of its points). Such a polygon can be constructed by 
starting from an arbitrary point in R. The smaller we take the length 
of the sides of the polygon, the greater the accuracy with which the 
sides of the polygon will agree with the field of directions of the dif- 
ferential equation, not only at their initial points but throughout 
their whole length. Without going into the proof, we here state the 
fact that, by successively diminishing the length of the sides, a poly- 
gon constructed in this way may actually be made to approach closer 
and closer to the integral curve through the initial point. 
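
The polygon construction just described is what is now usually called Euler's method. As a minimal sketch, under assumptions of our own (the function names are hypothetical, and we take equation (31a), $y' = -x/y$, with the starting point $(0, 1)$, whose integral curve is the circular arc $y = \sqrt{1 - x^2}$), the following code builds the polygon and measures how far its last vertex lies from the integral curve.

```python
import math

def euler_polygon(f, x0, y0, x_end, steps):
    """Vertices of the polygon whose sides follow the field of directions y' = f(x, y)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    vertices = [(x, y)]
    for _ in range(steps):
        y += h * f(x, y)   # each side takes the direction at its initial point
        x += h
        vertices.append((x, y))
    return vertices

def field(x, y):
    # direction field of equation (31a)
    return -x / y

def endpoint_error(steps):
    """Distance in y between the polygon and the circle at x = 0.5."""
    _, y_end = euler_polygon(field, 0.0, 1.0, 0.5, steps)[-1]
    return abs(y_end - math.sqrt(1.0 - 0.5 ** 2))

coarse = endpoint_error(10)
fine = endpoint_error(1000)
```

Shortening the sides (more steps) brings the polygon closer to the integral curve, in line with the statement made above without proof.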


b. The Differential Equation of a Family of Curves. Singular 
Solutions. Orthogonal Trajectories 


The existence theorem shows that every differential equation has
a family of integral curves. This suggests that we ask the reverse
question. Does every one-parameter family of curves $\phi(x, y, c) = 0$
or $y = g(c, x)$ have a corresponding differential equation
that is satisfied by all the curves of the family? If so, how can we find
this differential equation? Here the essential point is that $c$, the
parameter of the family of curves, does not occur in the differential
equation, so that the differential equation is in a sense a representation
of the family of curves not involving a parameter. In fact, it is easy
to find such a differential equation. Differentiating with respect to
$x$, in

$$\phi(x, y, c) = 0 \tag{32a}$$

we have

$$\phi_x + \phi_yy' = 0. \tag{32b}$$

If we eliminate the parameter $c$ between this equation and the
equation $\phi = 0$, the result is the desired differential equation. This
elimination is always possible for a region of the plane in which the
equation $\phi = 0$ can be solved for the parameter $c$ in terms of $x$ and
$y$. We then have only to substitute the expression $c = c(x, y)$ thus found
in the expressions for $\phi_x$ and $\phi_y$ in order to obtain a differential
equation for the family of curves.

As a first example, we consider the family of concentric circles
$x^2 + y^2 - c = 0$, from which, by differentiation with respect to $x$,
we obtain the differential equation

$$x + yy' = 0, \tag{32c}$$

in agreement with (31a), p. 698.

Another example is the family $(x - c)^2 + y^2 = 1$ of circles with
unit radius and center on the $x$-axis. By differentiation with respect
to $x$, we obtain

$$(x - c) + yy' = 0,$$

and on eliminating $c$, we obtain the differential equation

$$y^2(1 + y'^2) = 1.$$

The family $y = (x - c)^2$ of parabolas touching the $x$-axis likewise
leads by way of the equation $y' = 2(x - c)$ to the required differential
equation

$$y'^2 = 4y.$$
In the last two examples we see that the corresponding differential
equations are satisfied not only by the curves of the family but, in the 
first case, also by the lines y = 1 and y = —1 and, in the second case, 
also by the x-axis, y = 0. These facts, which can at once be verified 
analytically, also follow without calculation from the geometrical 
meaning of the differential equation. For these lines are the envelopes 
of the corresponding families of curves, and since the envelopes at 
each point touch a curve of the family, they must at that point have 
the direction prescribed by the field of directions. Therefore, every 
envelope of a family of integral curves must itself satisfy the differ- 
ential equation. Solutions of the differential equation that are found 
by forming the envelope of a one-parameter family of integral curves 
are called singular solutions. 

Let $R$ be a region that is simply covered by a one-parameter family
of curves $\Phi(x, y) = c = \text{constant}$. If to each point $P$ of $R$ we assign
the direction of the tangent of the curve passing through $P$, we obtain
a field of directions defined by the differential equation $y' = -\Phi_x/\Phi_y$
[see (32b)]. If, on the other hand, to each point $P$ we assign the
direction of the normal to the curve passing through it, the resulting
field of directions is defined by the differential equation

$$y' = \frac{\Phi_y}{\Phi_x}.$$

The solutions of this differential equation are called the orthogonal
trajectories of the original family of curves $\Phi(x, y) = c$. The curves
$\Phi = c$ (the level lines of the function $\Phi$) and their orthogonal
trajectories intersect everywhere at right angles. Hence, if a family of
curves is given by the differential equation $y' = f(x, y)$, we can find
the differential equation of the orthogonal trajectories without
integrating the given differential equation, for the equation of the
orthogonal trajectories is

$$y' = -\frac{1}{f(x, y)}.$$

In the example (31a) discussed above, from the differential equation
satisfied by the circles $x^2 + y^2 = c$ we find that the differential
equation of the orthogonal trajectories is $y' = y/x$. The orthogonal
trajectories are therefore straight lines through the origin [see (31b)].

If $p > 0$, the family of confocal parabolas (cf. Chapter 3, p. 234)
$y^2 - 2p(x + p/2) = 0$ satisfies the differential equation

$$y' = \frac{1}{y}\left(-x + \sqrt{x^2 + y^2}\right).$$

Hence, the differential equation of the orthogonal trajectories of this
family is

$$y' = \frac{-1}{\left(-x + \sqrt{x^2 + y^2}\right)/y} = \frac{1}{y}\left(-x - \sqrt{x^2 + y^2}\right).$$

The solutions of this differential equation are the parabolas

$$y^2 - 2p(x + p/2) = 0,$$

where $p < 0$; these are parabolas confocal with one another and with
the curves of the first family.
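
The orthogonality of the two families of confocal parabolas can be spot-checked numerically: at every point with $y \neq 0$ the two slopes should multiply to $-1$. The following short sketch uses function names of our own choosing and a few arbitrarily picked sample points.

```python
import math

def slope_family(x, y):
    """Slope prescribed by y' = (-x + sqrt(x^2 + y^2)) / y (parabolas with p > 0)."""
    return (-x + math.hypot(x, y)) / y

def slope_trajectories(x, y):
    """Slope prescribed by y' = (-x - sqrt(x^2 + y^2)) / y (parabolas with p < 0)."""
    return (-x - math.hypot(x, y)) / y

def product_of_slopes(x, y):
    return slope_family(x, y) * slope_trajectories(x, y)

sample_points = [(0.3, 1.2), (-2.0, 0.5), (1.5, -0.7)]
products = [product_of_slopes(x, y) for x, y in sample_points]
```

Algebraically the product is $(x^2 - (x^2 + y^2))/y^2 = -1$, so the computed values should differ from $-1$ only by rounding.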


c. Theorem of the Existence and Uniqueness of the Solution 


We now prove the theorem of the existence and uniqueness of the
solution of the differential equation $y' = f(x, y)$ that we stated on
p. 698. Without loss of generality, we can assume that for the
solution $y(x)$ in question the initial condition $y(x_0) = y_0$ reduces to
$y(0) = 0$, for we could introduce $y - y_0 = \eta$ and $x - x_0 = \xi$ as new
variables and should then obtain a new differential equation,
$d\eta/d\xi = f(\xi + x_0, \eta + y_0)$, of the same type, satisfying the desired
condition.

In the proof, we may confine ourselves to a sufficiently small neigh- 
borhood of the point x = 0. If we have proved the existence and 
uniqueness of the solution for such an interval about the point 
x = 0, we can then prove the existence and uniqueness for a neighbor- 
hood of one of its end points, and so on. 

Let us then consider a rectangle $|x| \le a$, $|y| \le b$ contained in
the domain of the function $f(x, y)$. There exist bounds $M, M_1$ such
that

$$|f_y(x, y)| \le M, \quad |f(x, y)| \le M_1 \quad \text{for } |x| \le a, \; |y| \le b. \tag{32d}$$

Replacing, if necessary, $a$ by a smaller positive value, we can always
bring about that

$$M_1a < b, \qquad Ma < 1. \tag{32e}$$

The inequalities (32d) will still be valid in the smaller rectangle. For
any solution $y(x)$ of $y' = f(x, y)$ with initial value $y(0) = 0$ we then
have the estimate $|y(x)| < b$ for $|x| \le a$. For otherwise there would
exist values $\xi$ for which $|\xi| \le a$, $|y(\xi)| = b$. There would be such a $\xi$
of smallest absolute value. Then the relation

$$b = |y(\xi)| = \left|\int_0^\xi f(x, y(x))\,dx\right| \le M_1|\xi| \le M_1a < b$$

would lead to a contradiction.

We first convince ourselves that there cannot be more than one 
solution of the differential equation satisfying the initial conditions, 
for if there were two solutions $y_1(x)$ and $y_2(x)$, the difference
$d(x) = y_1 - y_2$ would satisfy

$$d'(x) = f(x, y_1(x)) - f(x, y_2(x)).$$

By the mean value theorem, the right side of this equation can be
put in the form $(y_1 - y_2)f_y(x, \bar{y}) = d(x)f_y(x, \bar{y})$, where $\bar{y}$ is a value
intermediate between $y_1$ and $y_2$. In a neighborhood $|x| \le a$ of the
origin, $y_1$ and $y_2$ are continuous functions of $x$ that vanish at $x = 0$.
Here $b$ is an upper bound of the absolute values of the two functions
in this neighborhood, so that $|\bar{y}| < b$ whenever $|x| \le a$. Furthermore,
$M$ is a bound of $|f_y|$ in the region $|x| \le a$, $|y| \le b$. Finally,
let $D$ be the greatest value of $|d(x)|$ in the interval $|x| \le a$ and
suppose that this value is assumed at $x = \xi$. Then, for $|x| \le a$,

$$|d'(x)| = |d(x)f_y(x, \bar{y})| \le DM,$$

and therefore,

$$D = |d(\xi)| = \left|\int_0^\xi d'(x)\,dx\right| \le |\xi|DM \le aDM.$$

But since $aM < 1$, it follows that $D = 0$. That is, in such an interval
$|x| \le a$ we have¹ $y_1(x) = y_2(x)$.

¹The root idea of this proof is the fact that for bounded integrands integration gives
a quantity that vanishes to the same order as the interval of integration, as that
interval tends to zero.

By a similar integral estimate we arrive at a proof of the existence
of the solution. We construct the solution by a method that has other
important applications, in particular, to the numerical solution of
differential equations and to the inversion of mappings (see p. 266).
This is the process of iteration or successive approximations. Here we
obtain the solution as the limit function of a sequence of approximate
solutions $y_0(x), y_1(x), y_2(x), \ldots$. As a first approximation $y_0(x)$, we
take $y_0(x) = 0$. Using the differential equation, we take


$y_1(x) = \int_0^x f(\xi, 0)\, d\xi$


as the second approximation; from this we obtain the next approximation
$y_2(x)$,


$y_2(x) = \int_0^x f(\xi, y_1(\xi))\, d\xi,$


and in general the (n + 1)-th approximation is obtained from the
n-th by the equation


(33a) $y_n(x) = \int_0^x f(\xi, y_{n-1}(\xi))\, d\xi.$


If in an interval $|x| \le a$ these approximating functions converge
uniformly to a limit function y(x), we can at once perform the passage
to the limit under the integral sign and obtain for the limit
function the equation


(33b) $y(x) = \int_0^x f(\xi, y(\xi))\, d\xi.$


From this it follows by differentiation that y’ = f(x, y), so that y 
is actually the required solution. 
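
The iteration (33a) can be carried out exactly when f is a polynomial. The following is a minimal sketch, assuming the sample equation y' = x + y with y(0) = 0 (chosen here only for illustration; its solution is e^x - 1 - x). Polynomials are stored as coefficient lists, and the helper names `integrate`, `picard_step`, and `picard` are hypothetical.

```python
from fractions import Fraction

def integrate(poly):
    """Integrate a polynomial (poly[k] is the coefficient of x**k) from 0 to x."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]

def picard_step(y):
    """One step of (33a) for the sample equation y' = x + y, y(0) = 0."""
    integrand = list(y) + [Fraction(0)] * max(0, 2 - len(y))
    integrand[1] += 1          # add the term x of f(x, y) = x + y
    return integrate(integrand)

def picard(n):
    """Return the coefficient list of the approximation y_n, starting from y_0 = 0."""
    y = [Fraction(0)]
    for _ in range(n):
        y = picard_step(y)
    return y

# y_3 reproduces the Taylor terms x^2/2! + x^3/3! + x^4/4! of e^x - 1 - x.
y3 = picard(3)
```

Each step adds one further correct Taylor term, which is the behavior the convergence proof makes precise.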

We prove convergence for a sufficiently small interval $|x| \le a$
by means of the following estimate. We put $y_{n+1}(x) - y_n(x) = d_n(x)$
and by $D_n$ denote the maximum of $|d_n(x)|$ in the interval $|x| \le a$.

From the equation


$d_n'(x) = y_{n+1}' - y_n' = f(x, y_n) - f(x, y_{n-1})$


the mean value theorem gives


(33c) $d_n'(x) = d_{n-1}(x)\, f_y(x, \bar{y}_{n-1}(x)),$


where $\bar{y}_{n-1}$ is a value intermediate between $y_n$ and $y_{n-1}$. Let the
inequalities $|f_y(x, y)| \le M$, $|f(x, y)| \le M_1$ hold in the rectangular
region $|x| \le a$, $|y| \le b$. If we assume that for the function $y_n$ the
relation $|y_n| \le b$ holds in the interval $|x| \le a$, then, by the definition
of $y_{n+1}$, we have


$|y_{n+1}(x)| = \left| \int_0^x f(\xi, y_n(\xi))\, d\xi \right| \le |x| M_1 \le a M_1.$


We shall therefore choose the bound a for x so small that $a M_1 \le b$.
Then, in the interval $|x| \le a$, we shall certainly have $|y_{n+1}(x)| \le b$.
Since for $y_0(x) = 0$ it is obvious that $|y_0| \le b$, it follows by induction
that in the interval $|x| \le a$ we have $|y_n(x)| \le b$ for every n. Hence,
in (33c) we may use the estimate $|f_y| \le M$ and integrate to obtain


$|d_n(x)| = \left| \int_0^x d_n'(\xi)\, d\xi \right| \le \left| \int_0^x M\, |d_{n-1}(\xi)|\, d\xi \right|.$


Thus, we may bound the maximum $D_n$ of $|d_n(x)|$ in the interval
$|x| \le a$ by


$D_n \le a M D_{n-1}.$


We now take a so small that $aM \le q < 1$, where q is a fixed proper
fraction, say q = 1/4. Then $D_n \le q D_{n-1} \le q^n D_0$.
Let us now consider the series


$d_0(x) + d_1(x) + d_2(x) + \cdots + d_{n-1}(x) + \cdots.$


The nth partial sum of this series is $y_n(x)$. The absolute value of the
term $d_n(x)$ is not greater than the number $D_0 q^n$ when $|x| \le a$. Our
series is therefore dominated by a convergent geometric series with
constant terms. Hence (cf. Volume I, p. 535), it converges uniformly
in the interval $|x| \le a$ to a limit function y(x), and thus, we see that
an interval $|x| \le a$ exists in which the differential equation has a
unique solution.
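
The contraction estimate for the maxima of successive differences can be observed numerically. The sketch below is only an illustration under stated assumptions: the right-hand side f(x, y) = x + sin y (so that |f_y| <= M = 1), the interval 0 <= x <= a with a = 0.5, and trapezoidal quadrature on a uniform grid are all choices made here, and `picard_differences` is a hypothetical helper name.

```python
import math

def picard_differences(f, a, steps=6, grid=400):
    """Return the maxima of |y_{n+1} - y_n| on 0 <= x <= a for the Picard
    iterates of y' = f(x, y), y(0) = 0, using the trapezoidal rule."""
    h = a / grid
    xs = [k * h for k in range(grid + 1)]
    y = [0.0] * (grid + 1)                  # y_0(x) = 0
    maxima = []
    for _ in range(steps):
        g = [f(x, v) for x, v in zip(xs, y)]
        new = [0.0]
        for k in range(1, grid + 1):        # cumulative trapezoidal integral
            new.append(new[-1] + 0.5 * h * (g[k - 1] + g[k]))
        maxima.append(max(abs(u - v) for u, v in zip(new, y)))
        y = new
    return maxima

# Sample right-hand side (an assumption): |f_y| = |cos y| <= M = 1, a = 0.5.
D = picard_differences(lambda x, y: x + math.sin(y), a=0.5)
ratios = [D[n] / D[n - 1] for n in range(1, len(D))]
```

Every entry of `ratios` should stay at or below aM = 0.5, in line with the bound on successive maxima; in practice the ratios are much smaller, since the bound is not sharp.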

All that now remains to be shown is that this solution can be ex- 
tended step by step until it reaches the boundary of the (closed 
bounded) region R in which we assume f(x, y) to be defined. The proof 
so far shows that if the solution has been extended to a certain point, 
it can be continued onward over an x-interval of length a, where 
a, however, depends on the coordinates (x, y) of the end point of the 
portion already constructed. It might be imagined that in this advance 
a diminishes from step to step so rapidly that the solution cannot be 
extended by more than a small amount, no matter how many steps 
are made. This, as we shall show, is not the case.

Suppose that R' is a closed bounded region interior to R. Then
we can find a number b so small that for every point $(x_0, y_0)$ in R' the
whole square $x_0 - b \le x \le x_0 + b$, $y_0 - b \le y \le y_0 + b$ lies in R.
If by $M$ and $M_1$ we denote the upper bounds of $|f_y(x, y)|$ and $|f(x, y)|$
in the region R, then we find that in the preceding proof all the conditions
imposed on a are certainly satisfied if we take a to be, say, the
smallest of the numbers $b$, $1/(2M)$, and $b/M_1$. This no longer depends
on $(x_0, y_0)$; hence, at each step we can advance by an amount a that is
a constant. Thus, we can proceed step by step until we reach the
boundary of R'. Since R' can be chosen as any closed region in R,
we see that the solution can be extended to the boundary of R.¹
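
The statement that a solution can be continued until it leaves any closed bounded region can be illustrated with the equation y' = 1 + y^2, whose solution through the origin is y = tan x. The sketch below assumes a fixed-step classical Runge-Kutta integrator (a numerical stand-in, not the iteration of the proof) and the hypothetical helper name `rk4_until_exit`; it records where the computed solution first reaches |y| = b.

```python
import math

def rk4_until_exit(f, b, h=1e-4, x_max=10.0):
    """Integrate y' = f(x, y), y(0) = 0 with classical Runge-Kutta steps of
    fixed size h until |y| >= b; return the x at which this happens."""
    x, y = 0.0, 0.0
    while abs(y) < b and x < x_max:
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return x

# For y' = 1 + y^2 the exit from the strip |y| < b occurs near x = arctan(b).
exit_x = {b: rk4_until_exit(lambda x, y: 1 + y * y, b) for b in (1.0, 10.0, 100.0)}
```

However large b is taken, the exit point stays below pi/2: the graph leaves every bounded rectangle although the right-hand side is regular everywhere.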


Exercises 6.4 


1. Let 


f(x, y,c) = 0 


be a family of plane curves. By eliminating the constant c between this 
and the equation 


$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\, y' = 0,$

we get the differential equation

$F(x, y, y') = 0$


of the family of curves (cf. p. 700). Now let $\phi(p)$ be a given function of p;
a curve C satisfying the differential equation


$F(x, y, \phi(y')) = 0$


is called a trajectory of the family of curves f(x, y, c) = 0. The second and
third equations show that


$y' = \phi(Y')$

is the relation between the slope Y' of C at any given point, and the slope
y' of the curve f(x, y, c) = 0 passing through this point. The most important
case is $\phi(p) = -1/p$, leading to the equation


$F\left(x, y, -\frac{1}{y'}\right) = 0,$


which is the differential equation of the orthogonal trajectories of the
family of curves (cf. p. 701).


¹It is essential in this theorem that R be a closed and bounded region and not, for
example, the whole x, y-plane. This is shown by the differential equation

$y' = 1 + y^2,$

for which f(x, y) is defined and continuously differentiable for all x, y. The unique
solution of this equation with initial condition y = 0 for x = 0 is the function y = tan x
for $|x| < \pi/2$. The solution ceases to exist at $x = \pm\pi/2$, in spite of the fact that f(x, y)
is regular for all x and y. In agreement with the general theorem proved, the graph
of the solution leaves any prescribed bounded and closed subset of R, for example,
any rectangle $|x| \le a$, $|y| \le b$, before ceasing to exist. The function y = tan x either
exists in the whole interval $|x| \le a$ or exists and becomes larger than b in absolute
value in some subinterval.

Use this method to find the orthogonal trajectories of the following 
families of curves: 


(a) $x^2 + y^2 + cy - 1 = 0$
(b) $y = cx^2$
(c) $\frac{x^2}{a^2 + c} + \frac{y^2}{b^2 + c} = 1$, $(a > b > 0,\ -b^2 < c < \infty)$
(d) y = cos x + c
(e) $(x - c)^2 + y^2 = a^2$.

In each case draw the graphs of the two orthogonal families of curves.


2. For the family of lines y = cx, find the two families of trajectories in
which (a) the slope of the trajectory is twice as large as the slope of the
line; (b) the slope of the trajectory is equal and of opposite sign to the
slope of the line.


3. Differential equations of the type


$y = xp + \psi(p), \quad p = y'$
were first investigated by Clairaut. Differentiating, we get 


$[x + \psi'(p)]\, \frac{dp}{dx} = 0,$


which gives p = c = constant, so that

$y = xc + \psi(c)$


is the general integral of the differential equation; it represents a family
of straight lines. Another solution is


$x = -\psi'(p),$

which together with

$y = -p\,\psi'(p) + \psi(p)$


gives a parametric representation of the so-called singular integral. 
Note that the curve given by the last two equations is the envelope of the 
family of lines. 

Use this method to find the singular solution of the equations

(a) $y = xp - \frac{p^2}{4}$


(b) $y = xp + e^p$.
4. Find the differential equation of the tangents to the catenary

$y = a \cosh \frac{x}{a}.$


5. Lagrange investigated the most general differential equation linear 
in both x and y, namely, 


$y = x\,\phi(p) + \psi(p).$


Differentiating, we get


$p = \phi(p) + [x\,\phi'(p) + \psi'(p)]\, p',$


which is equivalent to the linear differential equation

$\frac{dx}{dp} + x\, \frac{\phi'(p)}{\phi(p) - p} + \frac{\psi'(p)}{\phi(p) - p} = 0,$


provided $\phi(p) - p \ne 0$ and p is not constant. Integrating and using the
first equation, we get a parametric representation of the general integral.
From the second equation we see that the equations $\phi(p) - p = 0$,
p = constant lead to a certain number of singular solutions representing
straight lines.

The solutions can be interpreted geometrically as follows: Consider
the Clairaut equation


$y = xp + \psi[\phi^{-1}(p)],$


where $\phi^{-1}(p)$ is the inverse function of $\phi(p)$, that is, $\phi^{-1}(\phi(p)) = p$. From this
we see that the solutions of the differential equation are a family of
trajectories of the family of straight lines


$y = xc + \psi[\phi^{-1}(c)]$
or
$y = x\,\phi(c) + \psi(c)$ (c = constant).


Thus, for example,

$y = -\frac{x}{p} + \psi(p)$


is the differential equation of the involutes (orthogonal trajectories of
the tangents) of the curve that represents the singular integral of the
Clairaut equation


$y = xp + \psi\left(-\frac{1}{p}\right).$


Use this method to integrate the equation
$y = x(p + a) - \frac{1}{4}(p + a)^2.$


6. Express, when possible, the integrals of the following differential equations
by elementary functions:

[The equations of this exercise are illegible in the source.]


In each case, draw a graph of the family of integral curves, and detect the 
singular solutions if any, from the figures. 


7. Integrate the homogeneous equation 


2 2 
[ay — y| = x2 — z? faro sin A 


and find the singular solutions. 


8. As mentioned in Exercise 3, a curve is the envelope of its tangents; hence,
it is the singular integral of the Clairaut equation satisfied by its tangent
lines. With this in mind, ascertain what kind of curve satisfies each of
the following properties and give the corresponding Clairaut equation:
(a) The sum of the x- and y-intercepts of a tangent line is constant.
(b) The length of the segment intercepted on a tangent by the axes is constant.
(c) The area bounded by the tangent line and the axes is constant.


6.5. Systems of Differential Equations and Differential 
Equations of Higher Order 


The above arguments extend to systems of differential equations 
of the first order with as many unknown functions of x as there are 
equations. As an example of sufficient generality, we shall consider 
here the system of two differential equations for two functions y(x) 
and z(x), 


$y' = f(x, y, z),$
$z' = g(x, y, z),$


where the functions f and g are continuously differentiable. This
system of differential equations can be interpreted by a field of directions
in x, y, z-space. To the point (x, y, z) of space a direction is assigned
whose direction cosines are in the proportion dx : dy : dz =
1 : f : g. The problem of integrating the differential equation again
amounts geometrically to finding curves in space that belong to this
field of directions. As in the case of a single differential equation, we
again have the fundamental theorem that through every point
$(x_0, y_0, z_0)$ of a region R in which the given functions f and g are
continuously differentiable, there passes one, and only one, integral curve
of the system of differential equations.¹ The region R is covered by
a two-parameter family of curves in space. These give the solutions
of the system of differential equations as two functions y(x) and z(x)
that both depend on the independent variable x and also on two arbitrary
parameters $c_1$ and $c_2$, the constants of integration.

Systems of differential equations of the first order are particularly 
important because differential equations of higher order, that is, 
differential equations in which derivatives higher than the first occur, 
can always be reduced to such systems. 

For example, the differential equation of the second order


$y'' = h(x, y, y')$


can be written as a system of two differential equations of the first 
order. We have only to take the first derivative of y with respect to 
x as a new unknown function z and then write down the system of 
differential equations 


$y' = z,$
$z' = h(x, y, z).$

This is exactly equivalent to the given differential equation of the
second order, in the sense that every solution of the one problem is
at the same time a solution of the other.

The reader may use this as a starting point for the discussion of the
linear differential equation of the second order and thus prove the
fundamental existence theorem for linear differential equations used
on p. 687.
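
The reduction of a second-order equation to a first-order system is also how such equations are usually treated numerically. The following sketch assumes a classical Runge-Kutta integrator with initial point x = 0 and the sample equation y'' = -y with y(0) = 1, y'(0) = 0 (solution cos x); the helper name `solve_second_order` and these choices are illustrative assumptions, not part of the text.

```python
import math

def solve_second_order(h_func, y0, z0, x_end, n=1000):
    """Integrate y'' = h(x, y, y') from x = 0 as the system y' = z, z' = h(x, y, z)
    with the classical Runge-Kutta method; return (y, z) at x_end."""
    step = x_end / n
    x, y, z = 0.0, y0, z0
    for _ in range(n):
        k1y, k1z = z, h_func(x, y, z)
        k2y = z + step * k1z / 2
        k2z = h_func(x + step / 2, y + step * k1y / 2, z + step * k1z / 2)
        k3y = z + step * k2z / 2
        k3z = h_func(x + step / 2, y + step * k2y / 2, z + step * k2z / 2)
        k4y = z + step * k3z
        k4z = h_func(x + step, y + step * k3y, z + step * k3z)
        y += step * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        z += step * (k1z + 2 * k2z + 2 * k3z + k4z) / 6
        x += step
    return y, z

# Sample equation (an assumption): y'' = -y, y(0) = 1, y'(0) = 0, so y = cos x.
y_end, z_end = solve_second_order(lambda x, y, z: -y, 1.0, 0.0, math.pi)
```

The pair (y, z) returned at x = pi should be close to (cos pi, -sin pi) = (-1, 0), since z plays the role of y'.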


Exercises 6.5 


1. Solve the following differential equations: 
(a) yy” =x 
(b) 2y” y” — 1 


(c) xy” - y =2
(d) 2xy” y” = y”? - 2


¹For $x_0 = y_0 = z_0 = 0$ the proof again can be given by a suitable iteration scheme
with the recursion formulae


$y_{n+1}(x) = \int_0^x f(\xi, y_n(\xi), z_n(\xi))\, d\xi,$


$z_{n+1}(x) = \int_0^x g(\xi, y_n(\xi), z_n(\xi))\, d\xi$


taking the place of the single relation (33a).

2. A differential equation of the form 
$f(y, y', y'') = 0$


(note that x does not occur explicitly) may be reduced to an equation of
the first order as follows: Choose y as the independent variable and p =
y' as the unknown function. Then


$y' = p, \quad y'' = \frac{dp}{dx} = \frac{dp}{dy}\, \frac{dy}{dx} = p\,p',$


where p' denotes dp/dy, and the differential equation becomes f(y, p, pp') = 0.
Use this method to solve the following equations. 


(a) 2yy” + yy? =0 

(b) yy’ + y?—-1=0 
(c) y3y” =1 

(d) y” — y? + yy =0 
(e) y”? = y” y”? 

(f) ye +y” = 0. 


3. Use the method of Exercise 2 to solve the following problem: At a 
variable point M of a plane curve T draw the normal to T; mark on this 
normal the point N where the normal meets the x-axis and C, the center 
of curvature of T at M. Find the curves such that 


MN - MC = constant = k. 
Discuss the various possible cases for k > 0 and k < 0, and draw the 
graphs. 
4. Find the differential equation of the third order satisfied by all circles 


$x^2 + y^2 + 2ax + 2by + c = 0.$


6.6 Integration by the Method of Undetermined Coefficients 


In conclusion, we mention yet another general device that can 
frequently be applied to the integration of differential equations. 
This is the method of integration in terms of power series. We assume 
that in the differential equation 


$y' = f(x, y)$


the function f(x, y) can be expanded as a power series in the variables
x and y and accordingly possesses derivatives of any order with
respect to x and y. We can then attempt to find the solutions of the
differential equation in the form of a power series


$y = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots$


and to determine the coefficients of this power series by means of the
differential equation.¹ To do this we proceed by forming the differentiated
series


$y' = c_1 + 2c_2 x + 3c_3 x^2 + \cdots,$


replacing y in the power series for f(x, y) by its expression as a power
series, and then equating the coefficients of like powers of x on the
right and on the left (method of undetermined coefficients). Then, if
$c_0 = c$ is given any arbitrary value, we can attempt to determine the
coefficients


$c_1, c_2, c_3, c_4, \ldots$


successively.

The following process, however, is often simpler and more elegant. 
We assume that we are seeking that solution of the differential
equation for which y(0) = 0, that is, for which the integral curve passes
through the origin. Then $c_0 = c = 0$. If we recall that by Taylor's
theorem the coefficients of the power series are given by the expressions


$c_\nu = \frac{1}{\nu!}\, y^{(\nu)}(0),$


we can calculate them easily. In the first place, $c_1 = y'(0) = f(0, 0)$.
To obtain the second coefficient $c_2$ we differentiate both sides of the
differential equation with respect to x and obtain


$y''(x) = f_x + f_y\, y'.$


If we here substitute x = 0 and the already known values y(0) = 0
and y'(0) = f(0, 0), we obtain the value $y''(0) = 2c_2$. In the same
way, we can continue the process and determine the other coefficients
$c_3, c_4, \ldots$, one after the other.

It can be shown that this process always gives a solution if the 
power series for f(x, y) converges absolutely in the interior of a 
circle about x = 0, y = 0. We shall not give the proof here. 
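
The method of undetermined coefficients is easy to mechanize when f is a polynomial in y. As a hedged illustration, the sketch below takes the sample equation y' = 1 + y^2 with y(0) = 0 (an assumption made here; its solution is tan x) and equates the coefficient of x^k on both sides, using a Cauchy product for y^2. The function name `series_coefficients` is hypothetical.

```python
from fractions import Fraction

def series_coefficients(n_terms):
    """Coefficients c_0, ..., c_{n_terms-1} of the power series solution of
    y' = 1 + y**2, y(0) = 0, found by equating coefficients of like powers."""
    c = [Fraction(0)]                       # c_0 = y(0) = 0
    for k in range(n_terms - 1):
        # coefficient of x**k in 1 + y**2 (Cauchy product for y**2)
        rhs = sum(c[i] * c[k - i] for i in range(k + 1))
        if k == 0:
            rhs += 1
        # coefficient of x**k in y' is (k + 1) * c_{k+1}
        c.append(Fraction(rhs) / (k + 1))
    return c

coeffs = series_coefficients(8)
```

The first eight coefficients agree with the familiar expansion tan x = x + x^3/3 + 2x^5/15 + 17x^7/315 + ..., and each coefficient is obtained from the earlier ones, as described above.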


1The first few terms of the series then form a polynomial of approximation to the 
solution. 




Exercises 6.6 


1. Obtain the power series expansions to the indicated number of terms for
the solution passing through the given point of each of the following
differential equations.


(a) $y' = x + y$, k terms, (0, a)

(b) $y' = \sin (x + y)$, four terms, $(0, \pi/2)$
(c) y = e7, four terms, (0, 0) 

(d) $y' = \sqrt{x^2 + y^2}$, four terms, (0, -1).


2. Solve the differential equation 
$y'' + \frac{1}{x}\, y' + y = 0,$


with y(0) = 1, y'(0) = 0, by means of a power series. Prove that this function
is identical with the Bessel function $J_0(x)$ defined in Section 4.12,
Exercise 7, p. 475.


6.7 The Potential of Attracting Charges and Laplace’s 
Equation 


Differential equations for functions of a single independent varia- 
ble, such as we have discussed above, are usually called ordinary 
differential equations, to indicate that they involve only “ordinary” 
derivatives, those of functions of one independent variable. In many 
branches of analysis and its applications, however, an important 
part is played by partial differential equations for the function of 
several variables, that is, equations between the variables and the 
partial derivatives of the unknown function. Here we shall touch 
upon some typical applications that involve Laplace’s differential 
equation. 

We have already considered the field of force produced by masses
according to Newton's law of attraction, and we have represented it
as the gradient of a potential $\Phi$ (cf. Chapter 4, pp. 489 ff.). In this
section we shall study the potential in somewhat greater detail.¹

¹An extensive literature is devoted to this important branch of analysis (see, e.g.,
O. D. Kellogg, Foundations of Potential Theory, Frederick Ungar Publ. Co.).


a. Potentials of Mass Distributions


As an extension of the cases considered previously, we now take
m as a positive or negative mass or charge. Negative masses do not
enter into the ordinary Newtonian law of attraction, but in the theory
of electricity, where mass is replaced by electric charge, we distinguish
between positive and negative electricity; there, Coulomb's
law of attracting charges has the same form as the law of gravitational
attraction of masses. If a charge m is concentrated at a single point of
space with coordinates $(\xi, \eta, \zeta)$, we call the expression m/r, where


$r = \sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2},$


the potential¹ of this mass at the point (x, y, z). By adding up a number
of such potentials for different sources or poles $(\xi_i, \eta_i, \zeta_i)$, we obtain
as before (cf. p. 439) the potential of a system of particles or point
charges


$\Phi = \sum_i \frac{m_i}{\sqrt{(x - \xi_i)^2 + (y - \eta_i)^2 + (z - \zeta_i)^2}}.$


The corresponding fields of force are given by the expression
$f = \gamma\, \operatorname{grad} \Phi$, where $\gamma$ is a constant independent of the masses and of their
positions.

¹We could call this a potential of the mass. Any function obtained by adding an
arbitrary constant to this could equally well be called a potential of the mass, since it
would give the same field of force.

For masses that are not concentrated at single points but are
distributed continuously with density $\mu(\xi, \eta, \zeta)$ over a definite portion
R of $\xi, \eta, \zeta$-space, we defined the potential of this mass-distribution
to be


(34a) $\Phi = \iiint_R \frac{\mu}{r}\, d\xi\, d\eta\, d\zeta.$


If the masses are distributed over a surface S with surface density
$\mu$, the potential of this surface is the surface integral


(34b) $\iint_S \frac{\mu}{r}\, d\sigma$


taken over the surface S with surface element $d\sigma$.
For the potential of a mass distributed along a curve, we likewise
obtain an expression of the form


(34c) $\int \frac{\mu(s)}{r}\, ds,$


where s is the length of arc on this curve and $\mu(s)$ is the linear density
of the mass.

For every such potential the level surfaces of $\Phi$ defined by $\Phi$ =
constant represent the equipotential surfaces.¹

¹Curves that at every point have the direction of the force vector are called lines of
force. Since the force here has the direction of the gradient of $\Phi$, the lines of force are
curves that everywhere intersect the level surfaces at right angles. We thus see that
the families of lines of force corresponding to potentials generated by a single pole or
by a finite number of poles run out from these poles as if from a source. In the case of
a single pole, for example, the lines of force are simply the straight lines passing
through the pole.

One example of the potential of a line-distribution is that of a
mass of constant linear density $\mu$ distributed along the segment
$-l \le z \le +l$ of the z-axis. We consider a point P with coordinates
(x, y) in the plane z = 0. For brevity we introduce $\rho = \sqrt{x^2 + y^2}$, the
distance of the point P from the origin. The potential at P is then


$\Phi(x, y) = \mu \int_{-l}^{+l} \frac{dz}{\sqrt{\rho^2 + z^2}} + C.$


Here we have added a constant C to the integral, which does not
affect the field of force derived from the potential. The indefinite
integral on the right can be evaluated as in Volume I [p. 270 (26)], and
we obtain


$\int \frac{dz}{\sqrt{\rho^2 + z^2}} = \operatorname{ar\,sinh} \frac{z}{\rho} = \log \frac{z + \sqrt{z^2 + \rho^2}}{\rho},$


so that the potential in the x, y-plane is given by


$\Phi(x, y) = 2\mu \log \frac{l + \sqrt{l^2 + \rho^2}}{\rho} + C.$


To obtain the potential of a line extending to infinity in both
directions, we give the value $-2\mu \log 2l$ to the constant² C and thus
obtain


$\Phi(x, y) = 2\mu \log \frac{l + \sqrt{l^2 + \rho^2}}{2l\rho}.$


²We make this choice in order that in the passage to the limit $l \to \infty$ the potential $\Phi$
shall remain finite.

If we now let the length l increase without limit, that is, if we let
the length of the line tend to infinity, the expression $\{l + \sqrt{l^2 + \rho^2}\}/2l$
tends to unity, and for the limiting value of $\Phi(x, y)$ we obtain the
expression


(35a) $\Phi(x, y) = -2\mu \log \rho.$

We thus see that, apart from the factor $-2\mu$, the expression

(35b) $\log \rho = \log \sqrt{x^2 + y^2}$


is the potential of a straight line perpendicular to the x, y-plane over
which a mass is distributed uniformly. The equipotential surfaces
here are the circular cylinders


$\rho = \sqrt{x^2 + y^2} = \text{constant}.$
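
The evaluation of the segment integral and the passage to the infinite line can be checked numerically. The sketch below is an illustration only: it compares a midpoint-rule value of the integral with the closed form 2 mu log((l + sqrt(l^2 + rho^2)) / rho), and then subtracts 2 mu log 2l to watch the limit -2 mu log rho emerge. The function names and the sample values rho = 2, l = 5 and l = 10^6 are assumptions made here.

```python
import math

def segment_potential(rho, l, mu=1.0, n=20000):
    """Midpoint-rule value of mu * integral_{-l}^{l} dz / sqrt(rho^2 + z^2)."""
    h = 2 * l / n
    total = sum(1.0 / math.hypot(rho, -l + (k + 0.5) * h) for k in range(n))
    return mu * h * total

def closed_form(rho, l, mu=1.0):
    """2 mu log((l + sqrt(l^2 + rho^2)) / rho), the evaluated integral."""
    return 2 * mu * math.log((l + math.hypot(l, rho)) / rho)

def normalized(rho, l, mu=1.0):
    """Closed form with the constant C = -2 mu log(2 l) added."""
    return closed_form(rho, l, mu) - 2 * mu * math.log(2 * l)

quadrature_gap = abs(segment_potential(2.0, 5.0) - closed_form(2.0, 5.0))
limit_gap = abs(normalized(2.0, 1.0e6) - (-2.0 * math.log(2.0)))
```

Both gaps should be very small, which reflects that the added constant changes only the normalization and not the dependence on rho.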


On p. 441 we already calculated the potential of a spherical surface
of constant density (i.e., mass per unit area) $\mu$. We found that for
a sphere of radius a and center at the origin the potential $\Phi$ at a point
P = (x, y, z) is given by


(36a) $\Phi = \frac{4\pi a^2}{r}\, \mu \quad (r > a)$
(36b) $\Phi = 4\pi a \mu \quad (r < a)$
where
(36c) $r = \sqrt{x^2 + y^2 + z^2}$


is the distance of P from the origin. The potential of a solid sphere of
density $\mu$ can be obtained by decomposing the ball into spherical
surfaces of radius a and surface density $\mu\, da$. Accordingly, the
potential of a solid sphere of radius A is obtained from formulae (36a, b)
by integrating with respect to a from 0 to A. One finds (cf. p. 442)
that


(37a) $\Phi = \frac{4\pi A^3}{3r}\, \mu \quad (r > A)$

(37b) $\Phi = \left(2\pi A^2 - \tfrac{2}{3}\pi r^2\right) \mu \quad (r < A).$


The corresponding gravitational force


(37c) $f = \gamma\, \operatorname{grad} \Phi$

exerted by the solid sphere on a unit mass at P is directed toward the
origin and has magnitude


(37d) $\frac{4\pi A^3}{3r^2}\, \gamma\mu$ for $r > A$, $\quad \frac{4\pi r}{3}\, \gamma\mu$ for $r < A$.
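
The decomposition of the ball into spherical shells can be reproduced numerically: integrating the shell potentials 4 pi a^2 mu / r (for r > a) and 4 pi a mu (for r < a) with respect to a from 0 to A should give (37a) outside and (37b) inside. The sketch below uses a midpoint rule; the function names and the sample values are assumptions for illustration.

```python
import math

def shell_potential(r, a, mu=1.0):
    """Potential (36a), (36b) of a spherical surface of radius a and density mu."""
    return 4 * math.pi * a * a * mu / r if r > a else 4 * math.pi * a * mu

def solid_sphere_numeric(r, A, mu=1.0, n=20000):
    """Integrate shell potentials of density mu da over 0 <= a <= A (midpoint rule)."""
    h = A / n
    return sum(shell_potential(r, (k + 0.5) * h, mu) for k in range(n)) * h

def solid_sphere_formula(r, A, mu=1.0):
    """Closed forms (37a) for r > A and (37b) for r < A."""
    if r > A:
        return 4 * math.pi * A ** 3 * mu / (3 * r)
    return (2 * math.pi * A ** 2 - 2 * math.pi * r ** 2 / 3) * mu

inside_gap = abs(solid_sphere_numeric(0.5, 1.0) - solid_sphere_formula(0.5, 1.0))
outside_gap = abs(solid_sphere_numeric(2.0, 1.0) - solid_sphere_formula(2.0, 1.0))
```

For an interior point the integral splits at a = r, with shells of radius smaller than r acting like point masses and the outer shells contributing constants, which is exactly how (37b) arises.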


In addition to the distributions previously considered, potential
theory also deals with so-called double layers, which we obtain in the
following way: We suppose that point charges M and $-M$ are located
at the points $(\xi, \eta, \zeta)$ and $(\xi + h, \eta, \zeta)$, respectively. The potential of
this pair of charges is given by


$\Phi = \frac{M}{\sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}} - \frac{M}{\sqrt{(x - \xi - h)^2 + (y - \eta)^2 + (z - \zeta)^2}}.$


If we let h, the distance between the two poles, tend to zero and at
the same time let the charge M increase indefinitely in such a way
that M is always equal to $-\mu/h$, where $\mu$ is a constant, $\Phi$ tends to the
limit


$\mu\, \frac{\partial}{\partial \xi} \left( \frac{1}{r} \right).$


We call this expression the potential of a dipole or doublet with its
axis in the $\xi$-direction and with "moment" $\mu$. Physically it represents
the potential of a pair of equal and opposite charges lying very close
to one another. In the same way, we can express the potential of a
dipole in the form


$\mu\, \frac{\partial}{\partial \nu} \left( \frac{1}{r} \right),$


where $\partial/\partial\nu$ denotes differentiation in an arbitrary direction $\nu$, that
of the axis of the dipole.
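
The limiting process that produces the dipole can be followed numerically. The sketch below is a hedged illustration: it evaluates the potential of charges M and -M at (xi, eta, zeta) and (xi + h, eta, zeta) with M = -mu/h for a small h and compares it with mu d/dxi (1/r) = mu (x - xi) / r^3. The function names and the sample points are assumptions; `math.dist` requires Python 3.8 or later.

```python
import math

def inv_r(p, q):
    """1/r for field point p = (x, y, z) and source point q = (xi, eta, zeta)."""
    return 1.0 / math.dist(p, q)

def charge_pair(p, q, h, mu=1.0):
    """Potential of charges M at q and -M at q + (h, 0, 0), with M = -mu / h."""
    M = -mu / h
    shifted = (q[0] + h, q[1], q[2])
    return M * inv_r(p, q) - M * inv_r(p, shifted)

def dipole(p, q, mu=1.0):
    """Limit mu * d/dxi (1/r) = mu * (x - xi) / r**3."""
    return mu * (p[0] - q[0]) / math.dist(p, q) ** 3

P, Q = (1.0, 2.0, 3.0), (0.0, 0.0, 0.0)
pair_gap = abs(charge_pair(P, Q, 1e-6) - dipole(P, Q))
```

The difference shrinks in proportion to h until rounding error takes over, as one expects of a difference quotient.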

If we imagine dipoles distributed over a surface S with moment-density
$\mu$ and if we assume that at each point the axis of the dipole
is normal to the surface, we obtain an expression of the form


$\iint_S \mu(\xi, \eta, \zeta)\, \frac{\partial}{\partial \nu} \left( \frac{1}{r} \right) d\sigma,$

where $\partial/\partial\nu$ denotes differentiation in the direction of the normal
to the surface (we can, as before, choose either direction for the
normal) and r is the distance of the point $(\xi, \eta, \zeta)$ that ranges over the
surface from the point (x, y, z). This potential of a double layer can be
thought of as arising in the following way: On each side of the surface
and at a distance h we construct surfaces, and we give one of
these surfaces a surface-density $\mu/2h$ and the other a surface-density
$-\mu/2h$. At an external point these two layers together create a potential
that tends to the expression above as $h \to 0$.


b. The Differential Equation of the Potential 


We shall assume that in all our expressions the point (x, y, z) considered
is at a point in space at which no charge is present, so that
the integrands and their derivatives with respect to x, y, z are continuous.
By virtue of this hypothesis we can obtain a relation that
all the foregoing potentials satisfy, namely, Laplace's differential
equation


(38a) $\Phi_{xx} + \Phi_{yy} + \Phi_{zz} = 0,$
which is abbreviated
(38b) $\Delta \Phi = 0.$


As can easily be verified by simple calculation (p. 59), this equation
is satisfied by the expression 1/r. It therefore holds also for all
the other expressions formed from 1/r by summation or integration,
since we can perform the differentiations with respect to x, y, z under
the integral sign.¹ This differential equation is also satisfied by the
potential of a double layer, for by virtue of the reversibility of the
order of differentiation² we find that for the potential of a single dipole
the equation


(38c) $\Delta \left( \frac{\partial}{\partial \nu}\, \frac{1}{r} \right) = \frac{\partial}{\partial \nu} \left( \Delta\, \frac{1}{r} \right) = 0$


holds.


¹Observe that the differentiation under the integral sign is only legitimate as long as
$r \ne 0$, that is, in regions where no charge is present. Laplace's equation does not have
to hold otherwise. For example, within a solid sphere, its potential satisfies, by (37b),
the equation


$\Delta \Phi = \Delta \left(2\pi A^2 - \tfrac{2}{3}\pi r^2\right) \mu = -4\pi\mu \ne 0.$


²Note that the differentiation $\partial/\partial\nu$ refers to the variables $(\xi, \eta, \zeta)$ and the expression
$\Delta$ to the variables (x, y, z). Incidentally, the function 1/r, considered as a function of
the six variables $(x, y, z;\ \xi, \eta, \zeta)$, is symmetrical in the two sets of variables and
therefore satisfies the Laplace equation

$\Phi_{\xi\xi} + \Phi_{\eta\eta} + \Phi_{\zeta\zeta} = 0$

with respect to the variables $(\xi, \eta, \zeta)$ also.


Laplace's equation is also satisfied by the expression $\log \sqrt{x^2 + y^2}$
obtained for the potential of a vertical line, as we can readily verify
(cf. also Chapter 5, p. 569). Since this no longer depends on the variable
z, it also satisfies the simpler Laplace's equation in two dimensions,


(38d) $\Phi_{xx} + \Phi_{yy} = 0.$
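
These statements about Laplace's equation can be tested with a finite-difference Laplacian. The sketch below is only a numerical check under stated assumptions: it uses central differences with step h = 10^-3, the point potential 1/r, the line potential log sqrt(x^2 + y^2), and, for contrast, the interior potential (2 pi A^2 - (2/3) pi r^2) mu of a solid sphere, whose Laplacian is -4 pi mu rather than zero. All function names and sample points are hypothetical.

```python
import math

def laplacian(phi, p, h=1e-3):
    """Central-difference approximation of phi_xx + phi_yy + phi_zz at p."""
    total = 0.0
    for i in range(3):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        total += (phi(plus) + phi(minus) - 2 * phi(p)) / (h * h)
    return total

def point_potential(p):
    """1/r for a unit charge at the origin."""
    return 1.0 / math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)

def line_potential(p):
    """log sqrt(x^2 + y^2), independent of z."""
    return 0.5 * math.log(p[0] ** 2 + p[1] ** 2)

def sphere_interior(p, A=2.0, mu=1.0):
    """Interior potential of a solid sphere of radius A and density mu."""
    r2 = p[0] ** 2 + p[1] ** 2 + p[2] ** 2
    return (2 * math.pi * A ** 2 - 2 * math.pi * r2 / 3) * mu

point = (1.0, 0.5, -0.7)
lap_point = laplacian(point_potential, point)
lap_line = laplacian(line_potential, point)
lap_interior = laplacian(sphere_interior, point)
```

The first two values should be close to zero, while the third should be close to -4 pi, showing that Laplace's equation fails where mass is present.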


The study of these and related partial differential equations forms
one of the most important branches of analysis. We point out that
potential theory is not by any means chiefly directed to the search
for general solutions of the equation $\Delta\Phi = 0$ but rather to the question
of the existence and to the investigation of those solutions that
satisfy preassigned conditions. Thus, a central problem of the theory
is the boundary value problem, in which we seek a solution $\Phi$ of
$\Delta\Phi = 0$ that, together with its derivatives up to the second order,
is continuous in a region R and that has preassigned continuous
values on the boundary of R.


c. Uniform Double Layers 


We cannot enter here into a detailed study of potential functions,¹
that is, of functions that satisfy Laplace's equation $\Delta u = 0$. In this
subject Gauss's theorem and Green's theorem (pp. 601, 608) are
among the chief tools employed. It will be sufficient to show by some
examples how such investigations are carried out.

¹Also called harmonic functions.

We shall first consider the potential of a double layer with constant
moment-density $\mu = 1$, that is, an integral of the form


(39) $V = \iint_S \frac{\partial}{\partial \nu} \left( \frac{1}{r} \right) d\sigma.$


This integral has a simple geometrical meaning. Let us assume that
each point of the surface carrying the double layer can be "seen"
from the point P with coordinates (x, y, z), meaning that it can be
joined to this point P by a straight line that meets the surface nowhere
else. The surface S, together with the rays joining its boundary to the
point P, forms a conical region R of space. We now state that the
potential of the uniform double layer, except perhaps for sign, is equal
to the solid angle that the boundary of the surface S subtends at the
point P. By this solid angle we mean the area of that portion of the
spherical surface of unit radius about the point P as center that is cut
out of the spherical surface by the rays going from P to the boundary
of S. We give this solid angle the positive sign when the rays pass
through the surface S in the same direction as the positive normal
$\nu$; otherwise we give it the negative sign.

To prove this, we recall that the function u = 1/r, when considered
not only as a function of (x, y, z) but also as a function of $(\xi, \eta, \zeta)$,
still satisfies the Laplace equation


$\Delta u = u_{\xi\xi} + u_{\eta\eta} + u_{\zeta\zeta} = 0.$


We fix the point P with coordinates (x, y, z) and denote the rectangular
coordinates in the conical region R by $(\xi, \eta, \zeta)$; we use a small sphere
of radius $\rho$ about the point P to cut off the vertex from R; the residual
region we call $R_\rho$. To the function u = 1/r, considered as a function
of $(\xi, \eta, \zeta)$ in the region $R_\rho$, we now apply Green's theorem (Chapter
5, p. 608) in the form


$\iiint_{R_\rho} \Delta u\, d\xi\, d\eta\, d\zeta = \iint_{S'} \frac{\partial u}{\partial n}\, d\sigma.$


Here S' is the boundary surface of $R_\rho$ and $\partial/\partial n$ denotes differentiation
in the direction of the outward normal. Since $\Delta u = 0$, the left side is
zero.¹

¹From this form of Green's theorem it follows in general that the surface integral
$\iint \frac{\partial u}{\partial n}\, d\sigma$
taken over a closed surface must always vanish when the function u satisfies
Laplace's equation $\Delta u = 0$ everywhere in the interior of the surface.

If we have chosen the positive normal direction $\nu$ on S so as to
coincide with the outward normal n, the surface integral on the right
side consists of three parts: (1) the surface integral


$\iint_S \frac{\partial u}{\partial n}\, d\sigma = \iint_S \frac{\partial}{\partial \nu} \left( \frac{1}{r} \right) d\sigma$


over the surface S, which is the expression V considered in (39); (2)
an integral over the lateral surface formed by the linear rays; (3) an
integral over a portion $\Gamma_\rho$ of the surface of the small sphere of radius
$\rho$. The second part is zero, since there the normal direction n is
perpendicular to the radius, and therefore is tangential to the sphere
r = constant. For the inner sphere with radius $\rho$ the symbol $\partial/\partial n$ is
equivalent to $-\partial/\partial\rho$, since the outward direction of the normal points
in the direction of diminishing values of r. We thus obtain the
equation


$V - \iint_{\Gamma_\rho} \frac{\partial}{\partial \rho} \left( \frac{1}{\rho} \right) d\sigma = 0$


or


$V = -\frac{1}{\rho^2} \iint_{\Gamma_\rho} d\sigma,$


where on the right we have to integrate over the portion $\Gamma_\rho$ of the
small spherical surface that belongs to the boundary of $R_\rho$. We now
write the surface element on the sphere with radius $\rho$ in the form
$d\sigma = \rho^2\, d\omega$, where $d\omega$ is the surface element on the unit sphere, to
obtain


$V = -\iint d\omega.$


The integral on the right is to be taken over the portion of the spherical
surface of unit radius lying in the cone of rays, and we see at once
that the right side has the geometrical meaning stated above; it is the
negative of the apparent angular magnitude if the normal direction on
S is chosen so that it points outward¹ from the conical region R.
Otherwise, the positive sign is to be taken.

If the surface S is not in the simple position relative to P described
above but instead is intersected several times by some of the rays
through P, we have only to divide the surface into a number of portions
of the simpler kind in order to see that the statement still holds
good. The potential of the uniform double layer (of moment 1) on a
bounded surface is therefore, except perhaps for sign, equal to the
"apparent" magnitude that the boundary has when looked at from the
point (x, y, z).

For a closed surface we see by subdividing it into two bounded
portions that our expression is equal to zero if the point P is outside
and equal to $-4\pi$ if it is inside.


1The negative sign is explained by the fact that with this choice of the normal direct- 
tion the negative charge lies on the side of the surface facing the point P. 
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A similar argument shows in the case of two independent varia- 
bles that the integral 


ð 
f Ay (log r) ds 


along the curve C, except possibly for sign, is equal to the angle that 
this curve subtends at the point P with the coordinates (x, y). 

This result, like the corresponding result in space, can also be 
explained geometrically as follows. Let the point Q with the coordi- 
nates (€, n) lie on the curve C. Then the derivative of log r at the point 
Q in the direction of the normal to the curve is given by the equation 


> (log r) = ar ô. dog r) cos (v, r) = — cos (v, r), 


where the symbol (v, r) denotes the angle between this normal and the 
direction of the radius vector r. On the other hand, when written in 
polar coordinates (r, 9), the element of arc ds of the curve has the 
form 


_ yew go —TY*? + y? _ rd 
ds = yx? + y? dð = — Sat xy” ® cos (v, 7) 


(cf. Volume I, p. 351), so that the integral is transformed as follows: 


fz (log r) ds = if cos (v, r) ——— r ao = fao. 


cos (v, r) 


The final integral on the right is the analytical expression for the 
angle. 
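The statement about the subtended angle can be checked numerically. The
following is only a sketch under stated assumptions: the curve is taken to
be an ellipse traversed counterclockwise, the function name
`subtended_angle` and its parameters are hypothetical, and the integral of
the normal derivative of $\log r$ is approximated by the midpoint rule in
the curve parameter. For a closed curve one expects $2\pi$ when $P$ lies
inside and $0$ when $P$ lies outside.

```python
import math


def subtended_angle(px, py, a=2.0, b=1.0, n=4000):
    """Approximate the integral of d(log r)/d(nu) ds over the ellipse
    Q(t) = (a cos t, b sin t), where r is the distance from P = (px, py)
    to Q and nu is the outward normal of the ellipse."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h                          # midpoint of the k-th cell
        qx, qy = a * math.cos(t), b * math.sin(t)  # point Q on the curve
        tx, ty = -a * math.sin(t), b * math.cos(t) # tangent vector dQ/dt
        rx, ry = qx - px, qy - py                  # radius vector from P to Q
        r2 = rx * rx + ry * ry
        # grad(log r) = (rx, ry) / r^2 and nu ds = (ty, -tx) dt, hence
        # d(log r)/d(nu) ds = (rx * ty - ry * tx) / r^2 dt.
        total += (rx * ty - ry * tx) / r2 * h
    return total


if __name__ == "__main__":
    print(subtended_angle(0.3, 0.2))   # P inside the ellipse
    print(subtended_angle(5.0, 0.0))   # P outside the ellipse
```

The integrand is exactly $d\theta$ in polar coordinates about $P$, which is
why the sum measures the total angle swept by the radius vector.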


d. The Mean Value Theorem 


As a second application of Green’s transformation, we prove the
following mean value property of potential functions:

Let $u$ satisfy the differential equation $\Delta u = 0$ in a certain region
$R$. Then the value of the potential function at the center $P$ of an
arbitrary solid sphere of radius $r$ lying completely in the region $R$ is
equal to the mean value of the function $u$ on the surface $S_r$ of the
sphere; that is,

$$u(x, y, z) = \frac{1}{4\pi r^2} \iint_{S_r} \bar{u}\, d\sigma, \tag{40a}$$

where $u(x, y, z)$ is the value at the center $P$ and $\bar{u}$ the value on the
surface $S_r$ of the sphere of radius $r$.

To prove this we proceed as follows: Let $S_\rho$ be a sphere concentric
to, and inside of, $S_r$ with radius $0 < \rho < r$. Since $\Delta u = 0$
everywhere in the interior of $S_\rho$, by the footnote on p. 720 we have

$$\iint_{S_\rho} \frac{\partial u}{\partial n}\, d\sigma = 0,$$

where $\partial u/\partial n$ is the derivative of $u$ in the direction of the
outward normal to $S_\rho$. If $(\xi, \eta, \zeta)$ are running coordinates and
if with the point $(x, y, z)$ as pole we introduce spherical coordinates by
the equations

$$\xi - x = \rho \cos\varphi \sin\theta, \qquad \eta - y = \rho \sin\varphi \sin\theta, \qquad \zeta - z = \rho \cos\theta,$$

the above equation becomes

$$\iint_{S_\rho} \frac{\partial u(\rho, \theta, \varphi)}{\partial \rho}\, d\sigma = 0.$$

Since the surface element $d\sigma$ of the sphere $S_\rho$ is equal to
$\rho^2\, d\omega$, where $d\omega$ is the element of surface of the sphere $S$
of unit radius (cf. (30e) p. 429), we find that

$$\iint_S \frac{\partial u(\rho, \theta, \varphi)}{\partial \rho}\, d\omega = 0,$$

where the region of integration no longer depends on $\rho$. Consequently,

$$\int_0^r d\rho \iint_S \frac{\partial u}{\partial \rho}\, d\omega = 0,$$

and on interchanging the order of integration and performing the
integration with respect to $\rho$, we have

$$\iint_S \{u(r, \theta, \varphi) - u(0, \theta, \varphi)\}\, d\omega = 0.$$

Since $u(0, \theta, \varphi) = u(x, y, z)$ is independent of $\theta$ and $\varphi$,

$$\iint_S u(r, \theta, \varphi)\, d\omega = u(x, y, z) \iint_S d\omega = 4\pi\, u(x, y, z).$$

Because

$$\iint_S u(r, \theta, \varphi)\, d\omega = \frac{1}{r^2} \iint_{S_r} u(r, \theta, \varphi)\, d\sigma,$$

where the integral on the right is to be taken over the surface of $S_r$,
the mean value property of $u$ is proved.

In exactly the same way, a function $u$ of two variables that satisfies
Laplace’s equation $u_{xx} + u_{yy} = 0$ has the mean value property
expressed by the formula

$$2\pi r\, u(x, y) = \int_{S_r} \bar{u}\, ds, \tag{40b}$$

where $\bar{u}$ denotes the value of the potential function on a circle $S_r$
with radius $r$ centered at the point $(x, y)$ and $ds$ is the element of arc
of this circle.
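Formula (40b) lends itself to a quick numerical illustration. The sketch
below is not part of the text: the helper `circle_mean` and the two sample
functions are assumptions chosen for illustration. It averages a function
over equally spaced points of a circle, which approximates
$\frac{1}{2\pi r}\int_{S_r} \bar{u}\, ds$, and compares the harmonic function
$e^x \cos y$ with the non-harmonic function $x^2 + y^2$, whose circle mean
exceeds its center value by $r^2$.

```python
import math


def circle_mean(u, x, y, r, n=2000):
    """Mean of u over the circle of radius r about (x, y), approximated
    by averaging u at n equally spaced points of the circle."""
    h = 2.0 * math.pi / n
    return sum(u(x + r * math.cos(k * h), y + r * math.sin(k * h))
               for k in range(n)) / n


def harmonic(x, y):
    # u = e^x cos y satisfies u_xx + u_yy = 0.
    return math.exp(x) * math.cos(y)


def not_harmonic(x, y):
    # u = x^2 + y^2 has u_xx + u_yy = 4, so the mean value property fails.
    return x * x + y * y


if __name__ == "__main__":
    x, y, r = 0.5, -0.3, 1.2
    print(circle_mean(harmonic, x, y, r), harmonic(x, y))
    print(circle_mean(not_harmonic, x, y, r), not_harmonic(x, y))
```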


e. Boundary Value Problem for the Circle. Poisson’s Integral 


A boundary value problem that we can treat rather completely is
that of Laplace’s equation in two independent variables $x$, $y$ for the
case of a circular boundary. Within the circular region $x^2 + y^2 \le R^2$
we introduce polar coordinates $(r, \theta)$. We wish to find a function
$u(x, y)$ continuous within the circle and on the boundary, possessing
continuous derivatives of the first and second order within the region,
satisfying Laplace’s equation $\Delta u = 0$, and having prescribed values
$u(R, \theta) = f(\theta)$ on the boundary. Here we assume that $f(\theta)$ is a
continuous periodic function of $\theta$ with sectionally continuous first
derivatives.

The solution of this problem, in terms of polar coordinates, is given
by the so-called Poisson integral:

$$u = \frac{R^2 - r^2}{2\pi} \int_0^{2\pi} \frac{f(\alpha)}{R^2 - 2Rr \cos(\theta - \alpha) + r^2}\, d\alpha. \tag{41}$$


To prove this, we begin by constructing special solutions of
Laplace’s equation in the following way. We transform Laplace’s
equation to polar coordinates, obtaining

$$\Delta u = \frac{1}{r} (r u_r)_r + \frac{1}{r^2} u_{\theta\theta} = 0,$$

and seek solutions that can be expressed in the “separated” form
$u = \varphi(r)\, \psi(\theta)$, that is, as a product of a function of $r$ and a
function of $\theta$. If we substitute this expression for $u$ in Laplace’s
equation, the equation becomes

$$\frac{r\, (r \varphi'(r))'}{\varphi(r)} = -\frac{\psi''(\theta)}{\psi(\theta)}.$$

Since the left side does not involve $\theta$ and the right side does not
involve $r$, the two sides must each be independent of both variables,
that is, must be equal to the same constant $k$. Accordingly, $\psi(\theta)$
satisfies the differential equation $\psi'' + k\psi = 0$.

Since the function $u$ and, hence, $\psi(\theta)$ must be periodic with period
$2\pi$, the constant $k$ is equal to $n^2$, where $n$ is an integer. Hence,

$$\psi(\theta) = a \cos n\theta + b \sin n\theta,$$

where $a$ and $b$ are arbitrary constants.
The differential equation for $\varphi(r)$,

$$r^2 \varphi''(r) + r \varphi'(r) - n^2 \varphi(r) = 0,$$

is a linear differential equation, and as we can immediately verify,
the functions $r^n$ and $r^{-n}$ are independent solutions. Since the second
solution becomes infinite at the origin, while $u$ is to be continuous
there, we are left with the first solution $\varphi = r^n$ and obtain the
separated solutions of Laplace’s equation

$$r^n (a \cos n\theta + b \sin n\theta).$$


We can now generate other solutions by linear combination of such
solutions according to the principle of superposition (cf. p. 684)

$$\frac{1}{2} a_0 + \sum_n r^n (a_n \cos n\theta + b_n \sin n\theta).$$

Even an infinite series of this form will be a solution, provided that
the series converges uniformly and can be differentiated term by
term twice in the interior of the circle.

The Fourier expansion of the prescribed boundary function $f(\theta)$

$$f(\theta) = \frac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos n\theta + b_n \sin n\theta),$$

regarded as a series in $\theta$, certainly converges absolutely and uniformly
(cf. Volume I, p. 604). Hence, the series

$$u(r, \theta) = \frac{1}{2} a_0 + \sum_{n=1}^{\infty} \frac{r^n}{R^n} (a_n \cos n\theta + b_n \sin n\theta)$$

a fortiori converges uniformly and absolutely in the interior of the
circle. This series, however, can be differentiated term by term,
provided $r < R$, because the resulting series again converge uniformly
(cf. Volume I, p. 539). The function $u(r, \theta)$ is, therefore, a
potential function. Since it has the prescribed value on the boundary,
it is a solution of our boundary value problem.

We can reduce this solution to the integral form (41) by introducing
the integrals for the Fourier coefficients,

$$a_n = \frac{1}{\pi} \int_0^{2\pi} f(\alpha) \cos n\alpha\, d\alpha, \qquad b_n = \frac{1}{\pi} \int_0^{2\pi} f(\alpha) \sin n\alpha\, d\alpha.$$

Since the convergence is uniform, we can interchange integration
and summation and obtain

$$u(r, \theta) = \frac{1}{\pi} \int_0^{2\pi} f(\alpha) \left[ \frac{1}{2} + \sum_{n=1}^{\infty} \frac{r^n}{R^n} \cos n(\theta - \alpha) \right] d\alpha.$$

Poisson’s integral formula will be proved if we can establish the
relation

$$\frac{1}{2} + \sum_{n=1}^{\infty} \frac{r^n}{R^n} \cos n\tau = \frac{1}{2}\, \frac{R^2 - r^2}{R^2 - 2Rr \cos\tau + r^2}.$$

But this can be proved by the method used in Volume I (p. 586), that
is, by reduction to a geometric series, using the complex representation

$$\cos n\tau = \frac{1}{2} (e^{in\tau} + e^{-in\tau}).$$

We leave the details of the proof to the reader.
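While the proof of the kernel relation is left to the reader, the relation
and formula (41) can at least be checked numerically. The following sketch
makes several assumptions for illustration: the function names are
hypothetical, the series is truncated after a fixed number of terms, and
the integral in (41) is approximated by an equally spaced sum over one
period. For the boundary values $f(\alpha) = \cos 2\alpha$ on the unit circle
the harmonic extension is $r^2 \cos 2\theta$, which gives a known value to
compare against.

```python
import math


def kernel_series(r, R, tau, terms=200):
    """Truncated series 1/2 + sum_{n>=1} (r/R)^n cos(n tau)."""
    q = r / R
    return 0.5 + sum(q ** n * math.cos(n * tau) for n in range(1, terms + 1))


def kernel_closed(r, R, tau):
    """Closed form (1/2)(R^2 - r^2) / (R^2 - 2Rr cos(tau) + r^2)."""
    return 0.5 * (R * R - r * r) / (R * R - 2.0 * R * r * math.cos(tau) + r * r)


def poisson(f, r, theta, R=1.0, n=4000):
    """Approximate Poisson's integral (41) with n equally spaced nodes."""
    h = 2.0 * math.pi / n
    s = 0.0
    for k in range(n):
        alpha = k * h
        s += f(alpha) / (R * R - 2.0 * R * r * math.cos(theta - alpha) + r * r)
    return (R * R - r * r) / (2.0 * math.pi) * s * h


if __name__ == "__main__":
    print(kernel_series(1.0, 2.0, 0.9), kernel_closed(1.0, 2.0, 0.9))
    boundary = lambda alpha: math.cos(2.0 * alpha)
    print(poisson(boundary, 0.5, 0.7), 0.25 * math.cos(1.4))
```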


Exercises 6.7 


1. By applying inversion to Poisson’s formula, find a potential function
$u(x, y)$ that is bounded in the region outside the unit circle and assumes
given values $f(\theta)$ on its boundary (the so-called outer boundary value
problem).

2. Find (a) the equipotential surfaces and (b) the lines of force for the
potential of the segment $x = y = 0$, $-l \le z \le +l$, of constant linear
density $\mu$.

3. Prove that if the values of a harmonic function $u(x, y, z)$ and of its
normal derivative $\partial u/\partial n$ are given on a closed surface $S$,
then the value of $u$ at any interior point is given by the expression

$$u(x, y, z) = \frac{1}{4\pi} \iint_S \left( \frac{1}{r} \frac{\partial u}{\partial n} - u\, \frac{\partial}{\partial n} \frac{1}{r} \right) d\sigma,$$

where $r$ is the distance from the point $(x, y, z)$ to the variable point of
integration (apply Green’s theorem to the functions $u$ and $1/r$).


6.8 Further Examples of Partial Differential Equations from 
Mathematical Physics 


a. The Wave Equation in One Dimension 


The phenomena of wave propagation (e.g., of light or sound) are 
governed by the so-called wave equation. We begin by considering the 
simple idealized case of a so-called one-dimensional wave. Such a 
wave involves the magnitude u of some property—for example, pres- 
sure, position of a particle, or intensity of an electric field—which 
depends not only on the coordinate of position x (we take the direc- 
tion of propagation as the x-axis) but also on the time t. 

A wave function $u(x, t)$ then satisfies a partial differential equation
of the form

$$u_{xx} = \frac{1}{a^2} u_{tt}, \tag{42a}$$

where $a$ is a constant depending on the physical nature of the medium.¹
We can find solutions of equation (42a) of the form

$$u = f(x - at),$$

where $f(\xi)$ is an arbitrary function of $\xi$, which we only assume to have
continuous derivatives of the first and second order. If we put
$\xi = x - at$, we see at once that our differential equation is actually
satisfied, for

$$u_{xx} = f''(\xi), \qquad u_{tt} = a^2 f''(\xi).$$

In the same way, using an arbitrary function $g(\xi)$, we obtain a solution
of the form

$$u = g(x + at).$$

Both solutions represent wave motions propagated with the velocity
$a$ along the $x$-axis; the first represents a wave traveling in the
positive $x$-direction, the second a wave traveling in the negative
$x$-direction. Let $u = f(x - at)$ have the value $u(x_1, t_1)$ at any point
$x_1$ at time $t_1$; then $u$ has the same value at time $t$ at the point
$x = x_1 + a(t - t_1)$, for then $x - at = x_1 - at_1$, so that
$f(x - at) = f(x_1 - at_1)$. In the same way we can see that the function
$g(x + at)$ represents a wave traveling in the negative $x$-direction with
velocity $a$.

¹For example, for transverse vibrations of a string, $u$ represents the lateral
displacement of a particle, and $a^2 = T/\rho$, where $T$ is the tension and
$\rho$ the mass per unit length.

We shall now solve the following initial value problem for this wave
equation. From all possible solutions of the differential equation we
wish to select those for which the initial state (at $t = 0$) is given by
two prescribed functions $u(x, 0) = \varphi(x)$ and $u_t(x, 0) = \psi(x)$. To
solve this problem, we merely write

$$u = f(x - at) + g(x + at) \tag{42b}$$

and determine the functions $f$ and $g$ from the two equations

$$\varphi(x) = f(x) + g(x),$$

$$\frac{1}{a} \psi(x) = -f'(x) + g'(x).$$

The second equation gives

$$c + \frac{1}{a} \int_0^x \psi(\tau)\, d\tau = -f(x) + g(x),$$

where $c$ is an arbitrary constant of integration. From this we readily
obtain the required solution in the form

$$u(x, t) = \frac{\varphi(x + at) + \varphi(x - at)}{2} + \frac{1}{2a} \int_{x - at}^{x + at} \psi(\tau)\, d\tau. \tag{42c}$$

The reader should prove for himself, by introducing new independent
variables $\xi = x - at$, $\eta = x + at$ instead of $x$ and $t$, that no
solutions of the differential equation exist other than those given.
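Formula (42c) can be evaluated directly. The sketch below is an
illustration under stated assumptions, not part of the text: the function
name `dalembert` is hypothetical, and the integral of $\psi$ is
approximated by the midpoint rule. For the sample data $\varphi = \sin$,
$\psi = \cos$ the formula reduces to the closed form
$\sin x \cos at + \frac{1}{a} \cos x \sin at$, which serves as a comparison.

```python
import math


def dalembert(phi, psi, a, x, t, n=2000):
    """Evaluate (42c): the mean of phi at x - at and x + at plus
    1/(2a) times the integral of psi over [x - at, x + at],
    the integral being approximated by the midpoint rule."""
    lo, hi = x - a * t, x + a * t
    h = (hi - lo) / n
    integral = sum(psi(lo + (k + 0.5) * h) for k in range(n)) * h
    return 0.5 * (phi(x + a * t) + phi(x - a * t)) + integral / (2.0 * a)


if __name__ == "__main__":
    a, x, t = 2.0, 0.4, 0.3
    exact = math.sin(x) * math.cos(a * t) + math.cos(x) * math.sin(a * t) / a
    print(dalembert(math.sin, math.cos, a, x, t), exact)
```

At $t = 0$ the interval of integration collapses and the function returns
$\varphi(x)$, in agreement with the first initial condition.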


b. The Wave Equation in Three-Dimensional Space 


In space of three dimensions the wave function $u$ depends on four
independent variables, namely, the three space coordinates $x$, $y$, $z$
and the time $t$. The wave equation is then

$$u_{xx} + u_{yy} + u_{zz} = \frac{1}{a^2} u_{tt}, \tag{43a}$$

or, more briefly,

$$\Delta u = \frac{1}{a^2} u_{tt}. \tag{43b}$$

Here again we can easily find solutions that represent the propagation
of a plane wave in the physical sense. Namely, any function
$f(\xi)$ that is twice continuously differentiable yields a solution of the
differential equation if we make $\xi$ a linear expression of the form

$$\xi = \alpha x + \beta y + \gamma z + at,$$

whose coefficients satisfy the relation

$$\alpha^2 + \beta^2 + \gamma^2 = 1.$$

For, since

$$\Delta u = (\alpha^2 + \beta^2 + \gamma^2) f''(\xi) = f''(\xi)$$

and

$$u_{tt} = a^2 f''(\xi),$$

we see that $u = f(\alpha x + \beta y + \gamma z + at)$ really is a solution of
the equation (43b).

If $q$ is the distance of the point $(x, y, z)$ from the plane
$\alpha x + \beta y + \gamma z = 0$, we know by analytical geometry (cf. p. 135)
that

$$q = \alpha x + \beta y + \gamma z.$$

Hence, in the first place, we see from the expression

$$u = f(q + at)$$

that at all points of a plane at a distance $q$ from the plane
$\alpha x + \beta y + \gamma z = 0$ and parallel to it the property that is being
propagated (represented by $u$) has the same value at a given moment. The
property is propagated in space in such a way that planes parallel to
$\alpha x + \beta y + \gamma z = 0$ are always surfaces on which the property is
constant; the velocity of propagation is $a$ in the direction
perpendicular to the planes. In theoretical physics a propagated
phenomenon of this kind is referred to as a plane wave.

A case of particular importance is that in which the property varies
periodically with time. If the frequency of the vibration is $\omega$, a
phenomenon of this kind may be represented by

$$u = \exp[ik(\alpha x + \beta y + \gamma z + at)] = \exp[ik(\alpha x + \beta y + \gamma z)] \exp(i\omega t),$$

where $k/2\pi$ is the reciprocal of the wavelength $\lambda$:
$k = 2\pi/\lambda = \omega/a$.

The wave equation with four independent variables has other
solutions, which represent spherical waves spreading out from a given
point, say the origin. A spherical wave is defined by the statement that
the property is the same at a given instant at every point of a sphere
with its center at the origin, that is, that $u$ has the same value at
all points of the sphere. To find solutions satisfying this condition,
we transform $\Delta u$ to polar coordinates $(r, \theta, \varphi)$, and then
assume that $u$ depends only on $r$ and $t$ but not on $\theta$ and $\varphi$.
If we accordingly equate the derivatives of $u$ with respect to $\theta$ and
$\varphi$ to zero (cf. p. 610), the differential equation (43b) becomes

$$u_{rr} + \frac{2}{r} u_r = \frac{1}{a^2} u_{tt}$$

or

$$(ru)_{rr} = \frac{1}{a^2} (ru)_{tt}.$$

For the moment we replace $ru$ by $w$ and observe that $w$ is a solution
of the equation

$$w_{rr} = \frac{1}{a^2} w_{tt},$$

which we have already discussed; hence, $w$ must be expressible in the
form

$$w = f(r - at) + g(r + at).$$

Consequently,

$$u = \frac{1}{r} [f(r - at) + g(r + at)]. \tag{43c}$$

The reader should now verify for himself directly that a function of
this type is actually a solution of the differential equation (43b).

Physically the function $u = f(r - at)/r$ represents a wave propagated
with velocity $a$ from a center outward into space.
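The direct verification asked of the reader can be supplemented by a
numerical spot check. The sketch below is only illustrative: the function
`residual` and the sample functions are assumptions, and the derivatives
in $u_{rr} + \frac{2}{r} u_r - \frac{1}{a^2} u_{tt}$ are replaced by central
difference quotients, so the result is accurate only up to the
discretization error. The residual should be close to zero for
$u = \sin(r - at)/r$ and clearly nonzero for a function that is not of the
stated type, such as $\sin(r - at)/r^2$.

```python
import math

A = 3.0  # assumed propagation velocity a, chosen for illustration


def spherical_wave(r, t):
    # u = f(r - at) / r with f = sin.
    return math.sin(r - A * t) / r


def not_a_solution(r, t):
    # Same profile divided by r^2; this is not of the stated type.
    return math.sin(r - A * t) / (r * r)


def residual(u, r, t, a=A, h=1e-3):
    """Central-difference approximation of u_rr + (2/r) u_r - u_tt / a^2."""
    u_rr = (u(r + h, t) - 2.0 * u(r, t) + u(r - h, t)) / (h * h)
    u_r = (u(r + h, t) - u(r - h, t)) / (2.0 * h)
    u_tt = (u(r, t + h) - 2.0 * u(r, t) + u(r, t - h)) / (h * h)
    return u_rr + 2.0 / r * u_r - u_tt / (a * a)


if __name__ == "__main__":
    print(residual(spherical_wave, 1.5, 0.2))
    print(residual(not_a_solution, 1.5, 0.2))
```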


c. Maxwell’s Equations in Free Space 


As a concluding example we shall discuss the system of equations 
known as Maxwell’s equations, which form the foundations of 
electrodynamics. However, we shall not attempt to approach the 
equations from the physical point of view but shall merely use them 
to illustrate the various mathematical concepts developed above. 

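The electromagnetic state in free space is determined by two
vectors given as functions of position and time, an electric vector
$\mathbf{E}$ with components $E_1$, $E_2$, $E_3$ and a magnetic vector
$\mathbf{H}$ with components $H_1$, $H_2$, $H_3$. These vectors satisfy
Maxwell’s equations:

$$\operatorname{curl} \mathbf{E} + \frac{1}{c} \frac{\partial \mathbf{H}}{\partial t} = 0, \tag{44a}$$

$$\operatorname{curl} \mathbf{H} - \frac{1}{c} \frac{\partial \mathbf{E}}{\partial t} = 0, \tag{44b}$$

where $c$ is the velocity of light in free space. Expressed in terms of
the components of the vectors, the equations are:

$$\frac{\partial E_3}{\partial y} - \frac{\partial E_2}{\partial z} + \frac{1}{c} \frac{\partial H_1}{\partial t} = 0,$$

$$\frac{\partial E_1}{\partial z} - \frac{\partial E_3}{\partial x} + \frac{1}{c} \frac{\partial H_2}{\partial t} = 0,$$

$$\frac{\partial E_2}{\partial x} - \frac{\partial E_1}{\partial y} + \frac{1}{c} \frac{\partial H_3}{\partial t} = 0,$$

and

$$\frac{\partial H_3}{\partial y} - \frac{\partial H_2}{\partial z} - \frac{1}{c} \frac{\partial E_1}{\partial t} = 0,$$

$$\frac{\partial H_1}{\partial z} - \frac{\partial H_3}{\partial x} - \frac{1}{c} \frac{\partial E_2}{\partial t} = 0,$$

$$\frac{\partial H_2}{\partial x} - \frac{\partial H_1}{\partial y} - \frac{1}{c} \frac{\partial E_3}{\partial t} = 0.$$

We thus have a system of six partial differential equations of the
first order, that is, of equations involving the first partial derivatives
of the components with respect to the space coordinates and to the
time.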
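As an illustration of the six component equations, one can test a simple
candidate field numerically. Everything specific in the sketch below is an
assumption made for the example and is not taken from the text: the value
of `C`, the profile `f`, and the plane-wave fields
$\mathbf{E} = (0, f(x - ct), 0)$, $\mathbf{H} = (0, 0, f(x - ct))$. The
partial derivatives are approximated by central differences, so the six
residuals vanish only up to discretization error.

```python
import math

C = 2.0  # assumed value of the constant c, for illustration only


def f(s):
    # Smooth wave profile, chosen arbitrarily.
    return math.exp(-s * s)


def E(x, y, z, t):
    return (0.0, f(x - C * t), 0.0)


def H(x, y, z, t):
    return (0.0, 0.0, f(x - C * t))


def d(F, i, axis, p, h=1e-4):
    """Central difference of component i of the field F along the given
    axis (0, 1, 2, 3 standing for x, y, z, t) at the point p."""
    q_plus, q_minus = list(p), list(p)
    q_plus[axis] += h
    q_minus[axis] -= h
    return (F(*q_plus)[i] - F(*q_minus)[i]) / (2.0 * h)


def curl(F, p):
    return (d(F, 2, 1, p) - d(F, 1, 2, p),
            d(F, 0, 2, p) - d(F, 2, 0, p),
            d(F, 1, 0, p) - d(F, 0, 1, p))


def maxwell_residuals(p):
    """Left sides of the three components of (44a) followed by (44b)."""
    curl_e, curl_h = curl(E, p), curl(H, p)
    first = [curl_e[i] + d(H, i, 3, p) / C for i in range(3)]
    second = [curl_h[i] - d(E, i, 3, p) / C for i in range(3)]
    return first + second


if __name__ == "__main__":
    print(maxwell_residuals((0.3, -1.0, 0.5, 0.1)))
```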

We shall now deduce some distinctive consequences of Maxwell’s
equations. If we form the divergence of both equations, and remember
that $\operatorname{div} \operatorname{curl} \mathbf{A} = 0$ (see p. 211) and that
the order of differentiation with respect to the time and formation of the
divergence is interchangeable, we obtain from (44a, b)

$$\operatorname{div} \mathbf{E} = \text{constant}, \tag{45a}$$

$$\operatorname{div} \mathbf{H} = \text{constant}; \tag{45b}$$

that is, the two divergences are independent of the time. In particular,
if initially $\operatorname{div} \mathbf{E}$ and $\operatorname{div} \mathbf{H}$ are
zero, they remain zero for all time.

We now consider any closed surface $S$ lying in the field and take
the volume integrals

$$\iiint \operatorname{div} \mathbf{E}\, d\tau$$

and

$$\iiint \operatorname{div} \mathbf{H}\, d\tau$$

throughout the volume enclosed by it. If we apply Gauss’s theorem
(p. 601) to these integrals, they become integrals of the normal
components $E_n$, $H_n$ over the surface $S$. That is, the equations

$$\operatorname{div} \mathbf{E} = 0, \qquad \operatorname{div} \mathbf{H} = 0$$

give

$$\iint_S E_n\, d\sigma = 0, \qquad \iint_S H_n\, d\sigma = 0.$$

In electrical theory, surface integrals

$$\iint_S E_n\, d\sigma \quad \text{or} \quad \iint_S H_n\, d\sigma$$

are called the electric or magnetic flux across the surface $S$, and our
result may accordingly be stated as follows:

The electric flux and the magnetic flux across a closed surface,
subject to the zero initial conditions on $\operatorname{div} \mathbf{E}$ and
$\operatorname{div} \mathbf{H}$, are zero.

We obtain a further deduction from Maxwell’s equations if we
consider a portion of surface $S$ bounded by the curve $\Gamma$, as follows:

If we denote the components of a vector normal to the surface $S$ by
the suffix $n$, it immediately follows from Maxwell’s equations (44a, b)
that

$$(\operatorname{curl} \mathbf{E})_n = -\frac{1}{c} \frac{\partial H_n}{\partial t},$$

$$(\operatorname{curl} \mathbf{H})_n = +\frac{1}{c} \frac{\partial E_n}{\partial t}.$$

If we integrate these equations over the surface with surface element
$d\sigma$, we can transform the left sides into line integrals taken round
the boundary $\Gamma$ by Stokes’s theorem (cf. p. 611). Doing this, and taking
the differentiation with respect to $t$ outside the integral sign, we
obtain the equations

$$\int_\Gamma E_s\, ds = -\frac{1}{c} \frac{d}{dt} \iint_S H_n\, d\sigma,$$

$$\int_\Gamma H_s\, ds = +\frac{1}{c} \frac{d}{dt} \iint_S E_n\, d\sigma,$$

where the symbols $E_s$ and $H_s$ under the integral signs on the left
are the tangential components of the electric and magnetic vectors in
the direction of increasing arc and the sense of description of the
curve $\Gamma$ in conjunction with the direction of the normal $n$ forms a
right-handed screw.

The facts expressed by these equations may be expressed in words
as follows:

The line integral of the electric or the magnetic force round an
element of surface is proportional to the rate of change of the magnetic
or electric flux, respectively, across the element of surface, the constant
of proportionality being $-1/c$ or $+1/c$.

Finally, we shall establish the connection between Maxwell’s
equations and the wave equation. We find, in fact, that each of the
vectors $\mathbf{E}$ and $\mathbf{H}$, that is, each component of the vectors,
satisfies the wave equation

$$\Delta u = \frac{1}{c^2} u_{tt}.$$

To show this, we eliminate the vector $\mathbf{H}$, say, from the two
equations, by differentiating the second equation with respect to the time
and substituting for $\partial \mathbf{H}/\partial t$ from the first equation.

It then follows that

$$c\, \operatorname{curl}(\operatorname{curl} \mathbf{E}) + \frac{1}{c} \frac{\partial^2 \mathbf{E}}{\partial t^2} = 0.$$

If we now use the vector relation¹

$$\operatorname{curl}(\operatorname{curl} \mathbf{A}) = -\Delta \mathbf{A} + \operatorname{grad}(\operatorname{div} \mathbf{A}), \tag{46}$$

and recall that

$$\operatorname{div} \mathbf{E} = 0,$$

we at once obtain

$$\Delta \mathbf{E} = \frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2}. \tag{47a}$$

In the same way we can show that the vector $\mathbf{H}$ satisfies the same
equation:

$$\Delta \mathbf{H} = \frac{1}{c^2} \frac{\partial^2 \mathbf{H}}{\partial t^2}. \tag{47b}$$

¹This vector relation follows immediately from its expression in terms of coordinates.


Exercises 6.8


1. Integrate the following partial differential equations:
(a) $u_{xy} = 0$
(b) $u_{xyz} = 0$
(c) $u_{xy} = a(x, y)$.

2. Find a solution of the equation
$$u_{xy} = u,$$
for which $u(x, 0) = u(0, y) = 1$, in the form of a power series.

3. Find the partial differential equation satisfied by the two-parameter
family of spheres
$$z^2 = 1 - (x - a)^2 - (y - b)^2.$$

4. Prove that if
$$z = u(x, y, a, b)$$
is a solution depending on two parameters $a$, $b$, of the partial
differential equation of the first order
$$F(x, y, z, z_x, z_y) = 0,$$
then the envelope of every one-parameter family of solutions chosen
from $z = u(x, y, a, b)$ is again a solution.
5. (a) Find particular solutions of the equation
$$u_x^2 + u_y^2 =$$
of the form $u = f(x) + g(y)$.
(b) Find particular solutions of the equation
$$u_x u_y = 1$$
of the forms $u = f(x) + g(y)$ and $u = f(x)\, g(y)$.
(c) Use the result of Exercise 4 to obtain other solutions of the
equation in part (b) by putting $b = ka$ in
$$u = ax + \frac{y}{a} + b,$$
where $k$ is a constant.
6. Solve the equation
$$u_{xx} + 5u_{xy} + 6u_{yy} = e^{x+y}$$
by reducing it to one of the form of Exercise 1(c).
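7. Prove that if $K$ is a homogeneous function of $x$, $y$, $z$ the equation
$$\frac{\partial}{\partial x}\left(K \frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(K \frac{\partial u}{\partial y}\right) + \frac{\partial}{\partial z}\left(K \frac{\partial u}{\partial z}\right) = 0$$
has a solution that is a power of $(x^2 + y^2 + z^2)$.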
8. Determine the solutions of the equation
az _ Oz 
at ax? 


that are also solutions of 


az)? _ a2 (22)? 
ot} k l 
9. (a) Obtain particular solutions of the wave equation
$$u_{xx} = \frac{1}{c^2} u_{tt}$$
in the form $u(x, t) = \varphi(x)\, \psi(t)$ satisfying the boundary conditions
$u(0, t) = u(\pi, t) = 0$.
(b) Express the solution of part (a) in the form $f(x + ct) + g(x - ct)$.
(c) Plucked string problem: By expanding $f(x)$ over the interval $[0, \pi]$
in a Fourier sine series (which defines $f(-x) = -f(x)$ for $0 < x < \pi$),
find a solution of the foregoing type that satisfies the initial
conditions, for $0 < x < \pi$,
$$u(x, 0) = f(x), \qquad u_t(x, 0) = 0,$$
where
(i) $$f(x) = \begin{cases} x, & 0 \le x \le \pi/2 \\ \pi - x, & \pi/2 \le x \le \pi \end{cases}$$
(ii) $$f(x) = \sum a_n \sin nx.$$
10. Let $u(x, t)$ denote a solution of the wave equation
$$u_{xx} = \frac{1}{a^2} u_{tt} \qquad (a > 0)$$
that is twice continuously differentiable. Let $\varphi(t)$ be a given function
that is twice continuously differentiable and such that
$$\varphi(0) = \varphi'(0) = \varphi''(0) = 0.$$
Find the solution $u$ for $x \ge 0$ and $t \ge 0$ that is determined by the
boundary conditions
$$u(x, 0) = u_t(x, 0) = 0 \quad (x \ge 0),$$
$$u(0, t) = \varphi(t) \quad (t \ge 0).$$

CHAPTER 7


Calculus of Variations


7.1 Functions and Their Extrema 


In the theory of ordinary maxima and minima of a differentiable
function $f(x_1, \ldots, x_n)$ of $n$ independent variables, the necessary
condition (pp. 326-7) for the occurrence of an extreme value at a
point of the domain of $f$ is

$$df = 0 \quad \text{or} \quad \operatorname{grad} f = 0 \quad \text{or} \quad f_{x_i} = 0 \quad (i = 1, \ldots, n). \tag{1}$$

These equations express the stationary character of the function $f$ at
the point in question. Whether these stationary points are actually
maximum or minimum points can only be decided upon further
investigation. In contrast to the equations (1), sufficient conditions for
extrema take the form of inequalities (see p. 349).

The calculus of variations is likewise concerned with the problem
of extreme values (respectively stationary values) but in a completely
new situation. Now the functions whose extrema we seek no longer
depend on one independent variable or a finite number of independent
variables within a certain region but are so-called functionals, or
functions of functions. Specifically, in order to determine them we
must know one or more functions or curves (or surfaces, as the case
may be), the so-called argument functions.

General attention was first drawn to problems of this type in 1696 
by John Bernoulli’s statement of the brachistochrone problem. 

In a vertical $x, y$-plane a point $A = (x_0, y_0)$ is to be joined to a point
$B = (x_1, y_1)$, such that $x_1 > x_0$, $y_1 > y_0$, by a smooth curve
$y = u(x)$ in such a way that the time taken by a particle sliding without
friction from $A$ to $B$ along the curve under gravity (which is taken as
acting in the direction of the positive $y$-axis) is as short as possible.

The mathematical expression of the problem is based on the physical
assumption that along such a curve $y = \varphi(x)$ the velocity $ds/dt$
($s$ being the length of arc of the curve) is proportional to
$\sqrt{2g(y - y_0)}$, the square root of the height of fall. The time taken
in the fall of the particle is therefore given by

$$T = \int_{x_0}^{x_1} \frac{dt}{ds} \frac{ds}{dx}\, dx = \frac{1}{\sqrt{2g}} \int_{x_0}^{x_1} \frac{\sqrt{1 + y'^2}}{\sqrt{y - y_0}}\, dx$$

(cf. Volume I, p. 408). If we drop the unimportant factor $1/\sqrt{2g}$ and
take $y_0 = 0$ (which we can do without loss of generality), we obtain the
following problem: Among all continuously differentiable functions
$y = \varphi(x)$, $y \ge 0$, for which $\varphi(x_0) = 0$, $\varphi(x_1) = y_1$,
find the one for which the integral

$$I\{\varphi\} = \int_{x_0}^{x_1} \sqrt{\frac{1 + y'^2}{y}}\, dx \tag{2a}$$

has the least possible value.

On p. 751 we shall obtain the result, very surprising to Bernoulli’s
contemporaries, that the curve $y = \varphi(x)$ must be a cycloid. Here
we wish to emphasize that Bernoulli’s problem and the elementary
problems of maxima and minima are quite different. The expression
$I\{\varphi\}$ depends on the whole course of the function $\varphi$. Since
$\varphi$ cannot be described by the values of a finite number of independent
variables, $I$ is a function of a new kind. We indicate its character of
“function of a function $\varphi(x)$” by means of braces.

The following is another problem of a similar nature: Two points
$A = (x_0, y_0)$ and $B = (x_1, y_1)$, where $x_1 > x_0$, $y_0 > 0$, $y_1 > 0$,
are to be joined by a curve $y = u(x)$ lying above the $x$-axis, in such a way
that the area of the surface of revolution formed when the curve is rotated
about the $x$-axis is as small as possible.

Using the expression given on p. 429 for the area of a surface of
revolution and dropping the unimportant factor $2\pi$, we have the
following mathematical statement of the problem: Among all continuously
differentiable functions $y = \varphi(x)$ for which $\varphi(x_0) = y_0$,
$\varphi(x_1) = y_1$, $\varphi(x) > 0$, find the one for which the integral

$$I\{\varphi\} = \int_{x_0}^{x_1} y \sqrt{1 + y'^2}\, dx \qquad [y = \varphi(x)] \tag{2b}$$

has the least possible value. It will be found that the solution is a
catenary.


The elementary geometrical problem of finding the shortest curve
joining two points $A$ and $B$ in the plane belongs to the same category.
Analytically, the problem is that of finding two functions $x(t)$, $y(t)$
of a parameter $t$ in an interval $t_0 \le t \le t_1$, for which the values
$x(t_0) = x_0$, $x(t_1) = x_1$ and $y(t_0) = y_0$, $y(t_1) = y_1$ are
prescribed and for which the integral

$$\int_{t_0}^{t_1} \sqrt{\dot{x}^2 + \dot{y}^2}\, dt \qquad \left( \dot{x} = \frac{dx}{dt},\ \dot{y} = \frac{dy}{dt} \right) \tag{2c}$$

has the least possible value. The solution is, of course, a straight
line.

Less trivial is the solution of the corresponding problem of finding
the geodesics on a given surface $G(x, y, z) = 0$, that is, of joining two
points on the surface with coordinates $(x_0, y_0, z_0)$ and
$(x_1, y_1, z_1)$ by the shortest possible curve lying in the surface. In
analytical language, we have the following problem: Among all triads of
functions $x(t)$, $y(t)$, $z(t)$ of the parameter $t$ that make the equation

$$G(x, y, z) = 0 \tag{3a}$$

an identity in $t$ and for which $x(t_0) = x_0$, $y(t_0) = y_0$,
$z(t_0) = z_0$ and $x(t_1) = x_1$, $y(t_1) = y_1$, $z(t_1) = z_1$, find that
for which the integral

$$\int_{t_0}^{t_1} \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\, dt \tag{3b}$$

has the least possible value.

The isoperimetric problem of finding a closed curve of given length
enclosing the largest possible area, already discussed on p. 366,
also belongs to the same category. We have proved above that the
solution is a circle.¹

¹The proof given there applied only to convex curves; the following remark, however,
enables us to extend the result immediately to any curve: We consider the convex
hull of the curve $C$ (i.e., the smallest convex set enclosing $C$). Its boundary $K$
consists of convex arcs of $C$ and rectilinear portions of tangents to $C$ that touch
$C$ at two points and bridge over concave parts of $C$ by straight lines. It is evident
that the area of $K$ exceeds that of $C$, provided $C$ is not convex, and, on the other
hand, that the perimeter of $K$ is less than that of $C$. If we now make $K$ expand
uniformly so that it always retains the same shape, until the resulting curve $K'$ has
the prescribed perimeter, $K'$ will be a curve of the same perimeter as $C$ but
enclosing a greater area. Hence, in the isoperimetric problem we may from the outset
confine ourselves to convex curves, in order to obtain the maximum area.

The general formulation of the type of problem encountered here
is as follows: We are given a function $F(x, \varphi, \varphi')$ of three
arguments that in the region of the arguments considered is continuous and
has continuous derivatives of the first and second orders. If in this
function $F$ we replace $\varphi$ by a function $y = \varphi(x)$ and $\varphi'$ by
the derivative $y' = \varphi'(x)$, $F$ becomes a function of $x$, and an
integral of the form

$$I\{\varphi\} = \int_{x_0}^{x_1} F(x, y, y')\, dx \tag{4}$$

becomes a definite number depending on the function $y = \varphi(x)$;
that is, it is a “functional evaluated for the function $\varphi(x)$.”

The fundamental problem of the calculus of variations is the
following:

Among all the functions that are defined and continuous and possess
continuous first and second derivatives in the interval
$x_0 \le x \le x_1$ and for which the boundary values $y_0 = \varphi(x_0)$ and
$y_1 = \varphi(x_1)$ are prescribed, find the one for which the functional
$I\{\varphi\}$ has the least possible value (or the greatest possible value).

In discussing this problem, an essential point is the nature of the
admissibility conditions imposed on the functions $\varphi(x)$. Forming the
value $I\{\varphi\}$ merely requires that when $\varphi(x)$ is substituted, $F$
shall be a sectionally continuous function of $x$, and this is assured if the
derivative $\varphi'(x)$ is sectionally continuous. But we have made the
conditions for admission more stringent by requiring that the first
derivatives, and even the second derivatives, of the functions $\varphi(x)$
shall be continuous. The field in which the maximum or minimum is
to be sought is of course thereby restricted. It will, however, be found
that this restriction does not, in fact, affect the solution, that is, that
the function that is most favorable when the wider field is available
will always be found in the more restricted field of functions with
continuous first and second derivatives.

Problems of this type occur very frequently in geometry and
physics. Here we mention only one example: the fundamental principle
of geometrical optics. We consider a ray of light in the $x, y$-plane
and assume that the velocity of light is a given function $v(x, y, y')$
of the point $(x, y)$ and of the direction $y'$ [$y = \varphi(x)$ being the
equation of the light-path and $y' = \varphi'(x)$ the corresponding
derivative]. Then Fermat’s principle of least time states:

The actual path of a ray of light between two given points $A$, $B$ is
such that the time taken by the light in traversing it is less than the
time that light would take to traverse any other path from $A$ to $B$.

In other words, if $t$ is the time and $s$ the length of arc of any curve
$y = \varphi(x)$ joining the points $A$ and $B$, the time that light would take
to traverse the portion of curve between $A$ and $B$ is given by the
integral

$$\int_{x_0}^{x_1} \frac{dt}{ds} \frac{ds}{dx}\, dx = \int_{x_0}^{x_1} \frac{\sqrt{1 + y'^2}}{v(x, y, y')}\, dx. \tag{5}$$

The actual path of the light is determined by the function $y = \varphi(x)$
for which this integral has the least possible value.

We see that the optical problem of finding the light ray is a special
case of the general problem stated above, corresponding to

$$F = \frac{\sqrt{1 + y'^2}}{v}.$$

In most optical cases the velocity of light $v$ is independent of the
direction and is merely a function of position $v(x, y)$.

7.2 Necessary Conditions for Extreme Values of a Functional 


a. Vanishing of the First Variation 


Our object is to find necessary conditions that a function $y = \phi(x)$
may yield a maximum or minimum or, to use a general term, an
extreme value, of the integral $I\{\phi\}$ defined by (4). We proceed by a
method quite analogous to that used in the elementary problem of
finding the extreme values of a function of one or more variables. We
assume that $y = \phi = u(x)$ is the solution. Then we have to express
the fact that (for a minimum) $I$ must increase when $u$ is replaced by
another admissible function $\phi$. Moreover, because we are merely
concerned with obtaining necessary conditions, we may confine
ourselves to the consideration of any special class of functions $\phi$ that
are close to $u$, that is, functions for which the absolute value of the
difference $\phi - u$ remains between prescribed bounds.

We think of the function $u$ as a member of a one-parameter family
with parameter $\varepsilon$, constructed as follows: We take any function $\eta(x)$
that vanishes on the boundary of the interval (that is, for which
$\eta(x_0) = 0$, $\eta(x_1) = 0$) and that has continuous first and second
derivatives everywhere in the closed interval. We then form the
family of functions


$$\phi(x, \varepsilon) = u(x) + \varepsilon\,\eta(x).$$


The expression $\varepsilon\,\eta(x) = \delta u$ is called a variation of the function $u$.
[Since $\eta(x) = \partial\phi/\partial\varepsilon$, the symbol $\delta$ denotes the differential obtained
when $\varepsilon$ is regarded as the independent variable and $x$ as a parameter.]
Then, if we regard the function $u$ as well as the function $\eta$ as fixed,
the value of the functional


$$I\{u + \varepsilon\eta\} = G(\varepsilon) = \int_{x_0}^{x_1} F(x,\, u + \varepsilon\eta,\, u' + \varepsilon\eta')\,dx$$


becomes a function of $\varepsilon$; and the postulate that $u$ shall give a minimum
of $I\{\phi\}$ implies that the function above shall possess a minimum for
$\varepsilon = 0$, so that as necessary conditions we have the equation


$$G'(0) = 0 \tag{6a}$$

and also the inequality

$$G''(0) \ge 0. \tag{6b}$$


The corresponding necessary conditions for a maximum are the
same equation $G'(0) = 0$ and the reversed inequality $G''(0) \le 0$.
The condition $G'(0) = 0$ must be satisfied for every function $\eta$ that
satisfies the above conditions but is otherwise arbitrary.

Putting aside the question of discriminating between maxima and
minima, we say that if a function $u$ satisfies the equation $G'(0) = 0$
for all functions $\eta$, the integral $I$ is stationary for $\phi = u$. If, as before,
we use the symbol $\delta$ to denote differentiation with respect to $\varepsilon$, we
also say that the equation


$$\delta I = \varepsilon\,G'(0) = 0,$$


when satisfied by a function $\phi = u$ and arbitrary $\eta$, expresses the
stationary character of $I$. The expression


$$\varepsilon\,G'(0) = \varepsilon \left[\frac{d}{d\varepsilon} \int_{x_0}^{x_1} F(x,\, u + \varepsilon\eta,\, u' + \varepsilon\eta')\,dx\right]_{\varepsilon = 0} \tag{6c}$$


is called the variation or, more accurately, the first variation,¹ of the
integral. Stationary character of an integral and vanishing of the first
variation, therefore, mean exactly the same thing.
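
As a numerical illustration of conditions (6a) and (6b), the following Python sketch evaluates $G(\varepsilon)$ in one concrete case. Everything specific in it is an assumption made for the example and is not taken from the text: the integrand $F = \sqrt{1 + y'^2}$ (arc length), the interval $0 \le x \le 1$, the function $u(x) = x$, and the variation $\eta(x) = \sin \pi x$. The function names are likewise hypothetical. The integral is approximated by the midpoint rule and the derivatives by central differences, so the values are approximate.

```python
import math

def G(eps, n=2000):
    """Midpoint-rule value of I{u + eps*eta} for F = sqrt(1 + y'^2) on [0, 1].

    Assumed for illustration: u(x) = x and eta(x) = sin(pi x),
    so that eta(0) = eta(1) = 0.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        du = 1.0                                  # u'(x)
        deta = math.pi * math.cos(math.pi * x)    # eta'(x)
        total += math.sqrt(1.0 + (du + eps * deta) ** 2) * h
    return total

def first_derivative_at_zero(step=1e-4):
    """Central-difference estimate of G'(0)."""
    return (G(step) - G(-step)) / (2.0 * step)

def second_derivative_at_zero(step=1e-3):
    """Central-difference estimate of G''(0)."""
    return (G(step) - 2.0 * G(0.0) + G(-step)) / step ** 2

if __name__ == "__main__":
    print("G(0)   =", G(0.0))
    print("G'(0)  =", first_derivative_at_zero())
    print("G''(0) =", second_derivative_at_zero())
```

For this choice one expects the estimate of $G'(0)$ to vanish up to rounding and the estimate of $G''(0)$ to be positive, which is consistent with the straight line giving a minimum of the length.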


¹ From this comes the use of the term calculus of variations, which is meant to indicate
that in this subject we are concerned with the behavior of functions of a function
when this independent function, or argument function, is made to vary by altering a
parameter $\varepsilon$.

Stationary character is necessary for the occurrence of maxima or
minima, but as in the case of ordinary maxima or minima, it is not a
sufficient condition for the occurrence of either of these possibilities.
We shall not treat the problem of sufficiency here; in what follows,
we confine ourselves to the problem of stationary character.

Our main object is to transform the condition $G'(0) = 0$ for the
stationary character of the integral in such a way that it becomes a
condition for $u$ only and no longer contains the arbitrary function $\eta$.


Exercises 7.2a 
1. In connection with the brachistochrone problem (see pp. 737-738), calculate
the time of fall when the points A and B are joined by a straight line.


2. Let the velocity of a particle with spherical coordinates $(r, \theta, \phi)$ moving
in three-dimensional space be $v = 1/f(r)$. What time does the particle take
to describe the portion of a curve given by a parameter $\sigma$ [the coordinates
of a point on the curve being $r(\sigma)$, $\theta(\sigma)$, $\phi(\sigma)$] between the points A and B?


b. Derivation of Euler’s Differential Equation 


The fundamental criterion of the calculus of variations is con- 
stituted by the following theorem: 


Necessary and sufficient for the integral


$$I\{\phi\} = \int_{x_0}^{x_1} F(x, \phi, \phi')\,dx \tag{7a}$$


to be stationary when $\phi = u$ is that $u$ shall be an admissible function
satisfying Euler’s differential equation


$$L[u] = F_u - \frac{d}{dx} F_{u'} = 0, \tag{7b}$$


or, in full,

$$F_{u'u'}\,u'' + F_{uu'}\,u' + F_{xu'} - F_u = 0. \tag{7c}$$

To prove this we note that we can differentiate the expression


$$G(\varepsilon) = \int_{x_0}^{x_1} F(x,\, u + \varepsilon\eta,\, u' + \varepsilon\eta')\,dx$$


with respect to $\varepsilon$ under the integral sign (cf. p. 74), provided that
the differentiation yields a function of $x$ that is continuous or at least
sectionally continuous. In this case, on putting $u + \varepsilon\eta = y$ and
differentiating, we obtain under the integral sign the expression
$\eta F_y + \eta' F_{y'}$, which, owing to the assumptions made about $F$, $u$, and $\eta$,
satisfies the conditions just stated. Hence, we immediately obtain


$$G'(0) = \int_{x_0}^{x_1} \{\eta\,F_u(x, u, u') + \eta'\,F_{u'}(x, u, u')\}\,dx. \tag{7d}$$


For subsequent purposes, we note that in deriving this equation
we have used nothing beyond the continuity of the functions $u$ and
$\eta$ and the sectional continuity of their first derivatives. In this
equation the arbitrary function appears under the integral sign in a
twofold form, namely, as $\eta$ and $\eta'$. We can, however, immediately get
rid of $\eta'$ by integration by parts; we have


$$\int_{x_0}^{x_1} \eta'\,F_{u'}\,dx = -\int_{x_0}^{x_1} \eta \left(\frac{d}{dx} F_{u'}\right) dx,$$


for by hypothesis $\eta(x_0)$ and $\eta(x_1)$ vanish. In this integration by parts
we have to assume that the expression $(d/dx)F_{u'}$ is defined and
integrable, but this is certainly the case since we assumed continuity
of the second derivatives of $F$. Hence, if we write


$$L[u] = F_u - \frac{d}{dx} F_{u'} \tag{7e}$$


for brevity, we have the equation

$$\int_{x_0}^{x_1} \eta\,L[u]\,dx = 0. \tag{7f}$$


This equation must be satisfied for every function $\eta$ that satisfies our
conditions but is otherwise arbitrary. From this, we conclude that


$$L[u] = 0, \tag{7g}$$


by virtue of the following: 


LEMMA I. If a function $C(x)$ that is continuous in the interval under
consideration satisfies the relation


$$\int_{x_0}^{x_1} \eta(x)\,C(x)\,dx = 0$$


for an arbitrary function $\eta(x)$ such that $\eta(x_0) = \eta(x_1) = 0$ and $\eta''(x)$
is continuous, then $C(x) = 0$ for every value of $x$ in the interval. (The
proof of this lemma will be postponed to p. 747.)

We could, however, obtain condition (7g) in a different way,¹ by
getting rid of the term in $\eta$ in the equation


$$\int_{x_0}^{x_1} (\eta\,F_u + \eta'\,F_{u'})\,dx = 0$$


by integration by parts, for if we write $F_{u'} = A$, $F_u = b = B'$ for
brevity and remember the boundary condition for $\eta$, on integrating
by parts we obtain


$$\int_{x_0}^{x_1} \eta\,F_u\,dx = \int_{x_0}^{x_1} \eta\,B'\,dx = -\int_{x_0}^{x_1} \eta'\,B\,dx.$$

If we put $\zeta = \eta'$, we have, in analogy to (7f), the condition

$$\int_{x_0}^{x_1} \zeta\,(A - B)\,dx = 0. \tag{7h}$$


In deriving this formula we need not make any assumptions about
the second derivatives of $\eta$ and $u$. On the contrary, it is sufficient to
assume that $\phi$ (or $u$ and $\eta$) are continuous and have sectionally
continuous first derivatives. Now equation (7h) must hold, not, it is true,
for any arbitrary (sectionally continuous) function $\zeta$ but only for
those functions $\zeta$ that are derivatives of a function $\eta(x)$ satisfying our
conditions at the end points. However, if $\zeta(x)$ is any given sectionally
continuous function satisfying the relation


$$\int_{x_0}^{x_1} \zeta(x)\,dx = 0, \tag{7i}$$

we can put

$$\eta(x) = \int_{x_0}^{x} \zeta(t)\,dt;$$


we have then constructed an admissible $\eta$, for $\eta' = \zeta$ and
$\eta(x_0) = \eta(x_1) = 0$. We thus obtain the following result:


A necessary condition that the integral should be stationary is

$$\int_{x_0}^{x_1} \zeta\,(A - B)\,dx = 0, \tag{7j}$$

where $\zeta$ is an arbitrary sectionally continuous function merely satisfying
the condition (7i).


¹ The first method is Lagrange’s, and the second, P. Du Bois-Reymond’s.

We now require the help of the following: 


LEMMA II. If a sectionally continuous function $S(x)$ satisfies the
condition


$$\int_{x_0}^{x_1} \zeta\,S\,dx = 0 \tag{8a}$$


for all functions $\zeta(x)$ that are sectionally continuous in the interval
and for which


$$\int_{x_0}^{x_1} \zeta\,dx = 0, \tag{8b}$$

then $S(x)$ is a constant $c$.


This lemma will also be proved below on p. 747. If meanwhile we
assume its truth, it follows from (7h), if we substitute the above
expressions for $A$ and $B$, that


$$\int_{x_0}^{x} F_u\,dx + c = F_{u'}.$$


Since $F_u$ is sectionally continuous, the left side regarded as an
indefinite integral may be differentiated with respect to $x$ and has $F_u$
as its derivative; the same is therefore true of the right side. Hence,
the expression $(d/dx)F_{u'}$ for the supposed solution $u$ exists, and the
equation


$$F_u = \frac{d}{dx} F_{u'} \tag{9a}$$


holds at all points of continuity of $u'$.

Thus, Euler’s equation remains the necessary condition for an
extreme value, or the condition that the integral should be stationary,
when the class of admissible functions $\phi(x)$ is extended from the
outset by requiring only sectional continuity of the first derivative
of $\phi(x)$.

Euler’s equation is an ordinary differential equation of the second 
order. Its solutions are called the extremals of the minimum problem. 
To solve the minimum problem, we must find among all the extremals 
that one that satisfies the prescribed boundary conditions. 
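
As a worked illustration of this procedure, consider an integrand chosen only for the example (it does not come from the text): $F = \phi'^2 + \phi^2$ on $0 \le x \le 1$, with the assumed boundary values $u(0) = 0$, $u(1) = 1$. Euler’s equation (7b), its extremals, and the one extremal fitting these boundary values are then:

```latex
\begin{align*}
F_u &= 2u, \qquad F_{u'} = 2u', \\
L[u] &= F_u - \frac{d}{dx} F_{u'} = 2u - 2u'' = 0, \\
u(x) &= A e^{x} + B e^{-x}
      && \text{(extremals; $A$, $B$ constants of integration)}, \\
u(x) &= \frac{\sinh x}{\sinh 1}
      && \text{(from the assumed values $u(0) = 0$, $u(1) = 1$)}.
\end{align*}
```

Here $F_{u'u'} = 2$, so the coefficient of $u''$ in Euler’s equation does not vanish and the equation can be solved for $u''$.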




If Legendre’s condition

$$F_{u'u'} \ne 0 \tag{9b}$$


is satisfied for $\phi = u(x)$, the differential equation can be brought
into the “regular” form $u'' = f(x, u, u')$, where the right side is a
known expression involving $x$, $u$, $u'$.


c. Proofs of the Fundamental Lemmas 


We now prove the two lemmas used above. To prove Lemma I, we
assume that at some point, say $x = \xi$, $C(x)$ is not zero and is positive.
Then, since $C(x)$ is continuous, we can certainly mark off a subinterval
of $(x_0, x_1)$,


$$\xi - a \le x \le \xi + a, \tag{9c}$$


within which $C(x)$ remains positive. We now choose a twice continuously
differentiable $\eta$, positive in the interior of this subinterval
and zero elsewhere, say, by setting for $x$ in (9c)


$$\eta(x) = (x - \xi + a)^4 (x - \xi - a)^4 = \{(x - \xi)^2 - a^2\}^4.$$


This function $\eta$ certainly fulfills all the prescribed conditions; $\eta(x)C(x)$
is positive inside the subinterval and zero outside it. The integral


$$\int_{x_0}^{x_1} \eta\,C\,dx$$


therefore cannot be zero.¹ Since this contradicts our hypothesis, $C(\xi)$
cannot be positive. For the same reasons, $C(\xi)$ cannot be negative.
Hence, $C(\xi)$ must vanish for all values of $\xi$ within the interval, as
was stated in the lemma.
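
The claim that the function $\eta(x) = \{(x - \xi)^2 - a^2\}^4$, continued as zero outside the subinterval, has continuous first and second derivatives can be checked directly. The following short computation is a sketch of that check; the abbreviation $w$ is introduced here only for convenience:

```latex
\begin{align*}
w(x) &= (x - \xi)^2 - a^2, \qquad \eta = w^4, \\
\eta'(x) &= 8\,(x - \xi)\,w^3, \\
\eta''(x) &= 8\,w^3 + 48\,(x - \xi)^2\,w^2 .
\end{align*}
```

At the end points $x = \xi \pm a$ we have $w = 0$, so $\eta$, $\eta'$, and $\eta''$ all vanish there and join continuously with the zero continuation. Inside the subinterval $w \ne 0$, so $\eta = w^4$ is positive.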

¹ The integral of a continuous nonnegative function is positive except when the
integrand vanishes everywhere; this follows immediately from the definition of
integral.

To prove Lemma II, we note that our assumption (8b) about $\zeta(x)$
immediately leads to the relation


$$\int_{x_0}^{x_1} \zeta(x)\,\{S(x) - c\}\,dx = 0, \tag{10}$$


where $c$ is an arbitrary constant. We now choose $c$ in such a way that
$S(x) - c$ is an admissible function $\zeta(x)$; that is, we determine $c$ by the
equation


$$0 = \int_{x_0}^{x_1} \zeta\,dx = \int_{x_0}^{x_1} \{S(x) - c\}\,dx = \int_{x_0}^{x_1} S(x)\,dx - c\,(x_1 - x_0).$$


Substituting this value of $c$ in equation (10) and taking $\zeta = S(x) - c$,
we at once have


$$\int_{x_0}^{x_1} \{S(x) - c\}^2\,dx = 0.$$


Since by hypothesis the integrand is continuous, or at least sectionally
continuous, it follows that


$$S(x) - c = 0$$


is an identity in $x$, as was stated in the lemma.


d. Solution of Euler’s Differential Equation in Special Cases. 
Examples. 


To find the solutions u of the minimum problem, we must find a 
particular solution of Euler’s differential equation for the interval
$x_0 \le x \le x_1$ that assumes the prescribed boundary values $y_0$ and $y_1$
at the end points. Since the complete integral of Euler’s differential
equation of the second order contains two constants of integration, 
we expect to determine a unique solution by making these two con- 
stants fit the boundary conditions, the latter giving two equations 
that the constants of integration must satisfy. 

In general, it is not possible to solve Euler’s differential equation 
explicitly in terms of elementary functions or quadratures, and we 
have to be content to show that the variational problem does reduce to 
a problem in differential equations. On the other hand, for important 
special cases and, in fact, for most of the classical examples, the 
equation can be solved by means of quadratures. 

The first case is that in which $F$ does not contain the derivative
$y' = \phi'$ explicitly: $F = F(\phi, x)$. Here Euler’s differential equation
is simply $F_u(u, x) = 0$; that is, it is no longer a differential equation
at all but forms an implicit definition of the solution $y = u(x)$. Here,
of course, there is no question of integration constants or the
possibility of satisfying boundary conditions.

The second important special case is that in which $F$ does not
contain the function $y = \phi(x)$ explicitly: $F = F(y', x)$. Here Euler’s
differential equation is $(d/dx)F_{u'} = 0$, which at once gives


$$F_{u'} = c,$$

where $c$ is an arbitrary constant of integration. We may use this
equation to express $u'$ as a function $f(x, c)$ of $x$ and $c$, and we then
have the equation


$$u' = f(x, c),$$


from which by a simple integration (quadrature) we obtain

$$u = \int^{x} f(\xi, c)\,d\xi + a;$$


that is, $u$ is expressed as a function of $x$ and $c$, together with an
additional arbitrary constant of integration $a$. In this case, therefore,
Euler’s differential equation can be completely solved by quadrature.

The third case, which is the most important in examples and 
applications, is that in which F does not contain the independent 
variable x explicitly: F = F(y, y’). In this case, we have the following 
important theorem: 


If the independent variable x does not occur explicitly in the varia- 
tional problem, then 


$$E = F(u, u') - u'\,F_{u'}(u, u') = c \tag{11}$$


is an integral of Euler’s differential equation. That is, if we substitute 
in this expression a solution u(x) of Euler’s differential equation for 
F, the expression becomes a constant independent of x. 

The truth of this statement follows at once if we form the derivative
$dE/dx$. We have


$$\frac{dE}{dx} = F_u\,u' + F_{u'}\,u'' - u''\,F_{u'} - u'^2\,F_{uu'} - u'\,u''\,F_{u'u'},$$

or by (7c)

$$\frac{dE}{dx} = u'\,L[u] = 0;$$


hence, for every solution $u$ of Euler’s differential equation, we have
$E = c$, where $c$ is a constant.
If we think of $u'$ as calculated from the equation $E = c$, say
$u' = f(u, c)$, a simple quadrature applied to the equation

$$\frac{dx}{du} = \frac{1}{f(u, c)}$$

gives $x = g(u, c) + a$ (where $a$ is another constant of integration);
that is, $x$ is expressed as a function of $u$, $c$, and $a$. By solving for $u$,
we then obtain the function $u(x, c, a)$. Hence, the general solution
of Euler’s differential equation, depending on two arbitrary constants
of integration, is obtained by a quadrature.

We shall now use these methods to discuss a number of examples. 


General Note 


There is a general class of examples in which $F$ is of the form

$$F = g(y)\sqrt{1 + y'^2},$$


where $g(y)$ is a function depending explicitly on $y$ only. For the
extremals $y = u$, our last rule gives at once


$$g(u)\sqrt{1 + u'^2} - \frac{g(u)\,u'^2}{\sqrt{1 + u'^2}} = c$$

or

$$\frac{g(u)}{\sqrt{1 + u'^2}} = c,$$

whence,

$$\frac{dx}{du} = \frac{1}{\sqrt{\{g(u)\}^2/c^2 - 1}},$$


and on integrating we have the equation


$$x - b = \int \frac{du}{\sqrt{\{g(u)\}^2/c^2 - 1}}, \tag{12}$$


where $b$ is another constant of integration. By evaluating the integral
on the right and solving the equation for $u$, we obtain $u$ as a function
of $x$ and of the two constants of integration $c$ and $b$.¹


¹ Of course, we may not be able to solve for $u$ in terms of elementary functions, but for
all practical purposes, these procedures define $u$ well enough.


The Surface of Revolution of Least Area

In this case, by (2b), p. 738, $g = y$. The integral (12) becomes


$$x - b = \int \frac{du}{\sqrt{u^2/c^2 - 1}} = c \operatorname{ar\,cosh} \frac{u}{c};$$


hence, the result is


$$y = u = c \cosh \frac{x - b}{c}.$$

That is, the solution of the problem of finding a curve that on rotation 
gives a surface of revolution with stationary area is a catenary (see 
Volume I, p. 378). 

A necessary condition for the occurrence of such a stationary curve 
is that the two given points A and B can be joined by a catenary for 
which y > 0. The question whether the catenary really represents a 
minimum will not be discussed here. 
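
The catenary can be checked against the first integral (11). For $g = y$ that integral reads $u/\sqrt{1 + u'^2} = c$, so this quantity should take the same value at every point of the curve. The Python sketch below evaluates it at a few points; the constants $c = 1.5$, $b = 0.3$, the sample abscissas, and the function names are arbitrary choices made for illustration.

```python
import math

def catenary(x, c, b):
    """u(x) = c cosh((x - b)/c)."""
    return c * math.cosh((x - b) / c)

def catenary_slope(x, c, b):
    """u'(x) = sinh((x - b)/c)."""
    return math.sinh((x - b) / c)

def first_integral(x, c, b):
    """E = F - u' F_{u'} = u / sqrt(1 + u'^2) for F = u sqrt(1 + u'^2)."""
    u = catenary(x, c, b)
    du = catenary_slope(x, c, b)
    return u / math.sqrt(1.0 + du * du)

# Sample points chosen arbitrarily; E should equal c at each of them.
values = [first_integral(x, c=1.5, b=0.3) for x in (-1.0, 0.0, 0.7, 2.0)]

if __name__ == "__main__":
    print(values)
```

The constancy follows from the identity $1 + \sinh^2 t = \cosh^2 t$; the numerical check only confirms it up to rounding.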


The Brachistochrone 


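Another example is obtained by taking $g = 1/\sqrt{y}$. This, according to
(2a), p. 738, is the problem of the brachistochrone. By means of the
substitutions $1/c^2 = k$, $u = k\tau$, $\tau = \sin^2(\theta/2)$, the integral (12)


$$\int \frac{du}{\sqrt{1/(uc^2) - 1}}$$


is immediately transformed into


$$x - b = k \int \sqrt{\frac{\tau}{1 - \tau}}\,d\tau = \frac{k}{2} \int (1 - \cos\theta)\,d\theta,$$


whence

$$x - b = \frac{1}{2}\,k\,(\theta - \sin\theta),$$

$$y = u = \frac{1}{2}\,k\,(1 - \cos\theta).$$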


The brachistochrone is accordingly (cf. Volume I, p. 329) a common 
cycloid with its cusps on the x-axis. 
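
To see the cycloid at work, the following Python sketch compares the time of descent along a cycloid arc with the time along the straight line joining the same end points. It rests on assumptions that are not spelled out in this passage and are made here only for illustration: the particle starts from rest at the origin, $y$ is measured downward, the speed is $v = \sqrt{2gy}$ with $g = 9.81$, and the end point is taken at $\theta = \pi$ on the cycloid with $k = 2$ and $b = 0$. The function names are hypothetical.

```python
import math

G_ACCEL = 9.81  # assumed gravitational acceleration

def cycloid_time(k, theta_end, n=1000):
    """Time of descent along x = (k/2)(t - sin t), y = (k/2)(1 - cos t),
    computed as the midpoint-rule integral of ds/v over 0 <= t <= theta_end."""
    h = theta_end / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        ds = 0.5 * k * math.sqrt(2.0 * (1.0 - math.cos(th)))  # ds/dtheta
        y = 0.5 * k * (1.0 - math.cos(th))
        v = math.sqrt(2.0 * G_ACCEL * y)                       # assumed speed law
        total += ds / v * h
    return total

def straight_line_time(x_end, y_end):
    """Time of descent from rest along the straight line from (0, 0) to
    (x_end, y_end): the integral of ds/v evaluated in closed form."""
    length = math.hypot(x_end, y_end)
    return length * math.sqrt(2.0 / (G_ACCEL * y_end))

# End point of the cycloid with k = 2 at theta = pi is (pi, 2).
t_cycloid = cycloid_time(2.0, math.pi)
t_line = straight_line_time(math.pi, 2.0)

if __name__ == "__main__":
    print("cycloid:", t_cycloid, " straight line:", t_line)
```

Along the cycloid the integrand $ds/v$ works out to the constant $\sqrt{k/(2g)}$, so under these assumptions the time of descent is simply $\theta\sqrt{k/(2g)}$, and it comes out smaller than the straight-line time.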


Exercises 7.2d 


1. Find the extremals for the following integrands:
(a) $F = \sqrt{y(1 + y'^2)}$
(b) $F = \sqrt{1 + y'^2}/y$
(c) $F = y\sqrt{1 - y'^2}$

2. Find the extremals for the integrand $F = x^n y'^2$, and prove that if $n \ge 1$,
two points lying on opposite sides of the y-axis cannot be joined by an
extremal.

3. Find the extremals for the integrand $y^n y'^m$, where $n$ and $m$ are even
integers.


4. Find the extremals for the integrand $F = ay'^2 + 2byy' + cy^2$, where $a$,
$b$, $c$ are given continuously differentiable functions of $x$. Prove that
Euler’s differential equation is a linear differential equation of the second
order. Why is it that when $b$ is constant, this constant does not enter into
the differential equation at all?


5. Show that the extremals for the integrand $F = e^x \sqrt{1 + y'^2}$ are given by
the equations $\sin(y - b) = e^{-(x - a)}$ and $y = b$, where $a$, $b$ are constants.
Discuss the form of these curves, and investigate how the two points
A and B must be situated if they can be joined by an extremal arc of the
form $y = f(x)$.

6. For the case where $F$ does not contain the derivative $y'$, deduce Euler’s
condition $F_y = 0$ by an elementary method.

7. Find a function giving the absolute minimum of


$$I\{y\} = \int_0^1 y'^2\,dx$$

with the boundary conditions

(a) $y(0) = y(1) = 0$

(b) $y(0) = 0$, $y(1) = 1$.


8. Find the extremals for $\int \sqrt{r^2 + r'^2}\,d\theta$, that is, the paths of shortest distance
in polar coordinates.


e. Identical Vanishing of Euler’s Expression 


Euler’s differential equation (7c), p. 743, for $F(x, y, y')$ may degenerate
into an identity that tells us nothing, that is, into a relation that is
satisfied by every admissible function $y = \phi(x)$. In other words,
the corresponding integral may be stationary for any admissible
function $y = \phi(x)$. If this degenerate case is to occur, Euler’s
expression


$$F_y - F_{xy'} - F_{yy'}\,y' - F_{y'y'}\,y''$$


must vanish at every point $x$ of the interval, no matter what function
$y = \phi(x)$ is substituted in it. We can, however, always find a curve
for which $y = \phi$, $y' = \phi'$, and $y'' = \phi''$ have arbitrary prescribed
values for a prescribed value of $x$. Euler’s expression must therefore
vanish for every quadruple of numbers $x$, $y$, $y'$, $y''$. We conclude that
the coefficient of $y''$ (i.e., $F_{y'y'}$) must vanish identically. $F$ must
therefore be a linear function of $y'$, say $F = ay' + b$, where $a$ and
$b$ are functions of $x$ and $y$ only. If we substitute this in the remaining
part of the differential equation,


$$F_{yy'}\,y' + F_{xy'} - F_y = 0,$$

it follows at once that

$$0 = a_y\,y' + a_x - a_y\,y' - b_y$$

or that

$$a_x - b_y$$


must vanish identically in $x$ and $y$. In other words, Euler’s expression
vanishes identically if, and only if, the integral is of the form


$$I = \int \{a(x, y)\,y' + b(x, y)\}\,dx = \int a\,dy + b\,dx,$$


where $a$ and $b$ satisfy the condition of integrability that we have
already met with on p. 104, that is, where $a\,dy + b\,dx$ is an exact
differential.
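
A simple instance of this degenerate case, chosen here only for illustration, is $a = x$, $b = y$, so that $F = xy' + y$ and $a_x = b_y = 1$. Then $a\,dy + b\,dx = d(xy)$, and the integral should depend only on the end points. The Python sketch below evaluates the integral by the midpoint rule along three hypothetical curves joining the assumed points A = (0, 1) and B = (2, 3); all three values should agree with $x_1 y_1 - x_0 y_0 = 6$ up to the quadrature error.

```python
import math

def integral(y, dy, x0=0.0, x1=2.0, n=20000):
    """Midpoint-rule value of the integral of F = x*y' + y along y = y(x)."""
    h = (x1 - x0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * h
        total += (x * dy(x) + y(x)) * h
    return total

# Three illustrative curves from A = (0, 1) to B = (2, 3), each given
# as a pair (y, y').
curves = [
    (lambda x: 1.0 + x, lambda x: 1.0),
    (lambda x: 1.0 + x + math.sin(math.pi * x),
     lambda x: 1.0 + math.pi * math.cos(math.pi * x)),
    (lambda x: 1.0 + 0.5 * x * x, lambda x: x),
]

values = [integral(y, dy) for y, dy in curves]
expected = 2.0 * 3.0 - 0.0 * 1.0   # x1*y1 - x0*y0

if __name__ == "__main__":
    print(values, expected)
```

Since every admissible curve gives the same value, the integral is trivially stationary for each of them, which is exactly the identical vanishing of Euler’s expression.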


7.3 Generalizations 


a. Integrals with More Than One Argument Function 


The problem of finding the extreme values (stationary values) of
an integral can be extended to the case where this integral depends
not on a single argument function but on a number of such functions
$\phi_1(x), \phi_2(x), \ldots, \phi_n(x)$.

The typical problem of this type may be formulated as follows:
Let $F(x, \phi_1, \ldots, \phi_n, \phi_1', \ldots, \phi_n')$ be a function of the $(2n + 1)$
arguments $x, \phi_1, \ldots, \phi_n'$, which is continuous and has continuous
derivatives up to, and including, the second order in the region under
consideration. If we replace $y_i = \phi_i$ by a function of $x$ with continuous
first and second derivatives, and $\phi_i'$ by its derivative, $F$ becomes a
function of the single variable $x$, and the integral


$$I\{\phi_1, \ldots, \phi_n\} = \int_{x_0}^{x_1} F(x, \phi_1, \ldots, \phi_n, \phi_1', \ldots, \phi_n')\,dx \tag{13}$$


over a given interval $x_0 \le x \le x_1$ has a definite value determined by
the choice of these functions.

In the comparison with the extreme value, we regard as admissible
all functions $\phi_i(x)$ that satisfy the above continuity conditions and
for which the boundary values $\phi_i(x_0)$ and $\phi_i(x_1)$ have prescribed
fixed values. In other words, we consider the curves $y_i = \phi_i(x)$
joining two given points A and B in $(n + 1)$-dimensional space with
coordinates $y_1, y_2, \ldots, y_n, x$. The variational problem now requires us
to find, among all these systems of functions $\phi_i(x)$, one
[$y_i = \phi_i(x) = u_i(x)$] for which the integral (13) has an extreme value
(a maximum or a minimum).

Again, we shall not discuss the actual nature of the extreme value
but shall confine ourselves to inquiring for what systems of argument
functions $\phi_i(x) = u_i(x)$ the integral is stationary.

We define the concept of stationary value in exactly the same way
as we did on p. 742. We embed the system of functions $u_i(x)$ in a
one-parameter family of functions depending on the parameter $\varepsilon$, in
the following way: Let $\eta_1(x), \ldots, \eta_n(x)$ be $n$ arbitrarily chosen
functions that vanish for $x = x_0$ and $x = x_1$, are continuous in the
interval, and possess continuous first and second derivatives there.
We embed the $u_i(x)$ in the family of functions
$y_i = \phi_i(x) = u_i(x) + \varepsilon\,\eta_i(x)$.

The term $\varepsilon\,\eta_i(x) = \delta u_i$ is called the variation of the function $u_i$.
If we substitute the expressions for $\phi_i$ in $I\{\phi_1, \ldots, \phi_n\}$, this integral
is transformed into

$$G(\varepsilon) = \int_{x_0}^{x_1} F(x,\, u_1 + \varepsilon\eta_1, \ldots, u_n + \varepsilon\eta_n,\, u_1' + \varepsilon\eta_1', \ldots, u_n' + \varepsilon\eta_n')\,dx,$$


which is a function of the parameter $\varepsilon$. A necessary condition that
there may be an extreme value when $\phi_i = u_i$ (i.e., when $\varepsilon = 0$) is


$$G'(0) = 0.$$


Exactly as for the case of one independent function, we say that the
integral $I$ has a stationary value for $\phi_i = u_i$ if the equation $G'(0) = 0$
or


$$\delta I = \varepsilon\,G'(0) = 0$$


holds, no matter how the functions $\eta_i$ are chosen subject to the
conditions stated above. In other words, stationary character of the
integral for a fixed system of functions $u_i(x)$ and vanishing of the first
variation $\delta I$ mean the same thing.

We have still the problem of setting up conditions for the stationary
character of the integral that do not involve the arbitrary variations
$\eta_i$. This requires no new ideas. We proceed as follows: First we take
$\eta_2, \eta_3, \ldots, \eta_n$ as identically zero (i.e., we do not let the functions
$u_2, \ldots, u_n$ vary). We thus consider only the first function $\phi_1(x)$ as
variable and then the condition $G'(0) = 0$, by p. 744, is equivalent to
Euler’s differential equation


$$F_{u_1} - \frac{d}{dx} F_{u_1'} = 0.$$


Since we can pick out any one of the functions $u_i(x)$ in the same way,
we obtain the following result:


A necessary and sufficient condition that the integral (13) may be
stationary is that the $n$ functions $u_i(x)$ shall satisfy the system of Euler’s
equations¹


$$F_{u_i} - \frac{d}{dx} F_{u_i'} = 0 \qquad (i = 1, 2, \ldots, n). \tag{13a}$$
dx 


This is a system of $n$ differential equations of the second order
for the $n$ functions $u_i(x)$. All solutions of this system of differential
equations are said to be extremals of the variational problem. Thus,
the problem of finding stationary values of the integral reduces to the
problem of solving these differential equations and adapting the
general solution to the given boundary conditions.


¹ Using Lemma II (Section 7.2, p. 746), we can prove that these differential equations
must hold under the general assumption that the admissible functions merely have
sectionally continuous first derivatives. However, if we wish to concentrate on the
formalism of the subject, it is more convenient to include continuity of the second
derivatives in the conditions of admissibility of the functions $\phi_i(x)$. We can then
write out the expressions $(d/dx)F_{u_i'}$ in the form

$$\frac{d}{dx} F_{u_i'} = \sum_{k=1}^{n} F_{u_i' u_k'}\,u_k'' + \sum_{k=1}^{n} F_{u_i' u_k}\,u_k' + F_{x u_i'}. \tag{13b}$$


b. Examples


The possibility of giving a general solution of the system of Euler’s
differential equations is even more remote than in the case in Section
7.2. Only in very special cases can we find all the extremals explicitly.
Here the following theorem, analogous to the particular case of
formula (11) on p. 749, is often useful:


If the function $F$ does not contain the independent variable $x$ explicitly,
i.e., $F = F(\phi_1, \ldots, \phi_n, \phi_1', \ldots, \phi_n')$, then the expression


$$E = F(u_1, \ldots, u_n, u_1', \ldots, u_n') - \sum_{i=1}^{n} u_i'\,F_{u_i'}$$


is an integral of Euler’s system of differential equations. That is, if we
consider any system of solutions $u_i(x)$ of Euler’s equations (13a), we
have


$$E = F - \sum_{i=1}^{n} u_i'\,F_{u_i'} = \text{constant} = c, \tag{13c}$$


where, of course, the value of this constant depends upon the system
of solutions substituted.

The proof follows the same lines as on p. 749; we differentiate the
left side of our expression with respect to $x$ and, using (13b), verify
that the result is zero.

A trivial example is the problem of finding the shortest distance
between two points in three-dimensional space. Here we have to
determine two functions $y = y(x)$, $z = z(x)$ such that the integral


$$\int_{x_0}^{x_1} \sqrt{1 + y'^2 + z'^2}\,dx$$


has the least possible value, the values of $y(x)$ and $z(x)$ at the end
points of the interval being prescribed. Euler’s differential equations
(13a) give


$$\frac{d}{dx}\,\frac{y'}{\sqrt{1 + y'^2 + z'^2}} = \frac{d}{dx}\,\frac{z'}{\sqrt{1 + y'^2 + z'^2}} = 0,$$


whence it follows at once that the derivatives $y'(x)$ and $z'(x)$ are
constant; hence, the extremals must be straight lines.

Somewhat less trivial is the problem of the brachistochrone in three
dimensions. (Gravity is again taken as acting along the positive
y-axis.) Here we have to determine $y = y(x)$, $z = z(x)$ in such a way that
the integral


$$\int_{x_0}^{x_1} \sqrt{\frac{1 + y'^2 + z'^2}{y}}\,dx = \int_{x_0}^{x_1} F(y, y', z')\,dx$$

is stationary. One of Euler’s differential equations gives

$$\frac{1}{\sqrt{y}}\,\frac{z'}{\sqrt{1 + y'^2 + z'^2}} = a,$$

and the integral (13c) gives

$$F - y'\,F_{y'} - z'\,F_{z'} = \frac{1}{\sqrt{y}}\,\frac{1}{\sqrt{1 + y'^2 + z'^2}} = b,$$


where $a$ and $b$ are constants. By division it follows that $z' = a/b = k$
is likewise constant. The curve for which the integral is stationary
must therefore lie in a plane $z = kx + h$. From the further equation


$$\frac{1}{\sqrt{y}}\,\frac{1}{\sqrt{1 + k^2 + y'^2}} = b$$

there follows, as is obvious from p. 751, that this curve must again
be a cycloid.


Exercises 7.3b 


1. Write down the differential equations for the path of a ray of light in
three dimensions in the case where (spherical coordinates $r$, $\theta$, $\phi$ being
used) the velocity of light is a function of $r$ (cf. Exercise 2, p. 743). Show
that the rays are plane curves.


2. Show that the geodesics (curves of shortest length joining two points) 
on a sphere are great circles. 


3. Find the geodesics on a right circular cone. 


4. Show that the path minimizing the distance between two nonintersect- 
ing smooth closed curves is their common normal line. 

5. Show that the path for the least time of fall from a given point to a given 
curve is the cycloid that meets the curve perpendicularly. 


6. Prove that the extremals of $\int F(x, y)\sqrt{1 + y'^2}\,dx$, with end points freely
movable on two curves, meet those curves orthogonally.


c. Hamilton’s Principle. Lagrange’s Equations 


Euler’s system of differential equations has a very important bear- 
ing on many branches of applied mathematics, especially dynamics. 
In particular, the motion of a mechanical system consisting of a finite 
number of particles can be expressed by the condition that a certain 
expression, the so-called Hamilton’s integral, is stationary. Here we 
shall briefly explain this connection. 

A mechanical system has $n$ degrees of freedom if its position is
determined by $n$ independent coordinates $q_1, q_2, \ldots, q_n$. If, for
example, the system consists of a single particle, we have $n = 3$, since
for $q_1, q_2, q_3$ we can take the three rectangular coordinates or the
three spherical coordinates. Again, if the system consists of two
particles held at unit distance apart by a rigid connection (assumed
to have no mass), then $n = 5$, since for the coordinates $q_i$ we can
take the three rectangular coordinates of one particle and two other
coordinates determining the direction of the line joining the two
particles.

A dynamical system can be described with sufficient generality by
means of two functions, the kinetic energy and the potential energy.
If the system is in motion, the coordinates $q_i$ will be functions $q_i(t)$
of the time $t$, the components of velocity being $\dot q_i = dq_i/dt$. The kinetic
energy associated with the dynamical system is a function of the
form


$$T(q_1, \ldots, q_n, \dot q_1, \ldots, \dot q_n) = \sum_{i,k=1}^{n} a_{ik}\,\dot q_i\,\dot q_k \qquad (a_{ik} = a_{ki}). \tag{14a}$$


The kinetic energy, therefore, is a homogeneous quadratic expression
in the components of velocity, the coefficients $a_{ik}$ being taken as
known functions, not depending explicitly on the time, of the
coordinates $q_1, \ldots, q_n$ themselves.¹

In addition to the kinetic energy, the dynamical system is supposed
to be characterized by another function, the potential energy
$U(q_1, \ldots, q_n)$, which depends on the coordinates of position $q_i$ only
and not on the velocities or the time.²

Hamilton’s principle states that the motion of a dynamical system
in the interval of time $t_0 \le t \le t_1$ from a given initial position to a given
final position is such that for this motion the integral


$$H\{q_1, \ldots, q_n\} = \int_{t_0}^{t_1} (T - U)\,dt \tag{14b}$$


is stationary, in the class of all continuous functions $q_i(t)$ that have
continuous derivatives up to, and including, the second order and that
have the prescribed boundary values for $t = t_0$ and $t = t_1$.


¹ We obtain this expression for the kinetic energy $T$ by thinking of the individual
rectangular coordinates of the particles of the system as expressed as functions of the
coordinates $q_1, \ldots, q_n$. Then the rectangular velocity components of the individual
particles can be expressed as linear homogeneous functions of the $\dot q_i$’s; from these we
form the elementary expression for the kinetic energy, namely, half the sum of the
products of the individual masses and the squares of the corresponding velocities.
² We restrict ourselves here to mechanical systems in which the forces acting are
conservative and independent of time. As is shown in dynamical textbooks, the potential
energy determines the external forces acting on the system (see p. 0000 for the case
of a single particle). In bringing the system from one position into another,
mechanical work is done; this is equal to the difference between the corresponding
values $U$ and does not depend on the particular motion from one position to another.

This principle of Hamilton’s is a fundamental principle of
dynamics. It contains in condensed form the laws of dynamics. When
applied to Hamilton’s integral, the Euler equations (13a) give
Lagrange’s equations,


$$\frac{d}{dt}\,\frac{\partial T}{\partial \dot q_i} - \frac{\partial T}{\partial q_i} = -\frac{\partial U}{\partial q_i} \qquad (i = 1, 2, \ldots, n), \tag{14c}$$


which are the fundamental equations of theoretical dynamics. 

Here we shall only make one noteworthy deduction, namely, the 
law of conservation of energy. 

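Since the integrand in Hamilton’s integral does not depend explicitly
on the independent variable $t$, for the solution $q_i(t)$ of the
differential equations of dynamics the expression


$$T - U - \sum_{i} \dot q_i\,\frac{\partial (T - U)}{\partial \dot q_i}$$


must be constant [see (13c)]. Since $U$ does not depend on the $\dot q_i$ and
$T$ is a homogeneous quadratic function in them (cf. p. 119),


$$\sum_{i} \dot q_i\,\frac{\partial (T - U)}{\partial \dot q_i} = \sum_{i} \dot q_i\,\frac{\partial T}{\partial \dot q_i} = 2T.$$


Hence

$$T + U = \text{constant};$$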


that is, during the motion the sum of the kinetic energy and the potential 
energy does not vary with time. 
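As a numerical illustration, which is not part of the original text, the following sketch integrates Lagrange’s equation for a plane pendulum and records how far T + U strays from its initial value. We assume $T = \frac{1}{2} m l^2 \dot\theta^2$ and $U = -m g l \cos\theta$; the function name, the parameter values, and the choice of the classical Runge-Kutta integrator are assumptions made for the example only.

```python
import math


def pendulum_energy_drift(theta0=1.0, omega0=0.0, g=9.81, length=1.0,
                          mass=1.0, dt=1e-3, steps=5000):
    """Return max |(T + U)(t) - (T + U)(0)| along an RK4 trajectory."""

    def accel(theta):
        # Lagrange's equation for the pendulum: theta'' = -(g / l) sin(theta)
        return -(g / length) * math.sin(theta)

    def energy(theta, omega):
        kinetic = 0.5 * mass * (length * omega) ** 2
        potential = -mass * g * length * math.cos(theta)
        return kinetic + potential

    theta, omega = theta0, omega0
    e0 = energy(theta, omega)
    worst = 0.0
    for _ in range(steps):
        # One classical fourth-order Runge-Kutta step for (theta, omega).
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, accel(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, accel(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, accel(theta + dt * k3t)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        worst = max(worst, abs(energy(theta, omega) - e0))
    return worst
```

With the default step size the returned drift is many orders of magnitude smaller than the energy itself, which is what the conservation law leads us to expect from an accurate integration.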


d. Integrals Involving Higher Derivatives 


Analogous methods can be used to attack the problem of the extreme
values of integrals in which the integrand F not only contains
the required function $y = \varphi$ and its derivative $\varphi'$ but also involves
higher derivatives. For example, suppose we wish to find the extreme
values of an integral of the form


(15a)    $I\{\varphi\} = \int_{x_0}^{x_1} F(x, \varphi, \varphi', \varphi'')\,dx,$


where in the comparison those functions $y = \varphi(x)$ are admissible that,
together with their first derivatives, have prescribed values at the end
points of the interval and that have continuous derivatives up to,
and including, the fourth order.

To find necessary conditions for an extreme value, we again assume
that y = u(x) is the desired function. We embed u(x) in a family of
functions $y = \varphi(x) = u(x) + \varepsilon\eta(x)$, where $\varepsilon$ is an arbitrary parameter
and $\eta(x)$ an arbitrarily chosen function with continuous derivatives
up to, and including, the fourth order that vanishes together with its
first derivative at the end points. The integral then takes the form
$G(\varepsilon)$, and the necessary condition


(15b)    $G'(0) = 0$


must be satisfied for all choices of the function $\eta(x)$. Proceeding in a
way analogous to that on p. 744, we differentiate under the integral
sign and thus obtain the above condition in the form


(15c)    $\int_{x_0}^{x_1} (\eta F_u + \eta' F_{u'} + \eta'' F_{u''})\,dx = 0,$


which must be satisfied if u is substituted for $\varphi(x)$. Integrating once
by parts, we reduce the term in $\eta'(x)$ to one in $\eta$, and integrating twice
by parts, we reduce the term in $\eta''(x)$ to one in $\eta$; taking the boundary
conditions into account, we easily obtain


(15d)    $\int_{x_0}^{x_1} \eta \left( F_u - \frac{d}{dx} F_{u'} + \frac{d^2}{dx^2} F_{u''} \right) dx = 0.$


Hence, the necessary condition for an extreme value (i.e., that the
integral may be stationary) is Euler’s differential equation


(15e)    $L[u] = F_u - \frac{d}{dx} F_{u'} + \frac{d^2}{dx^2} F_{u''} = 0.$


The reader can verify for himself that this is a differential equation
of the fourth order.¹
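A small numerical check may make (15e) concrete. Assume, purely for illustration, the integrand $F = \varphi''^2$ on $0 \le x \le 1$; then (15e) reduces to $u'''' = 0$, so that $u = x^3$ is an extremal. The sketch below perturbs u by $\varepsilon\eta$ with $\eta = x^2(1 - x)^2$, which vanishes together with $\eta'$ at both end points, and evaluates the integral by Simpson’s rule. The integrand, the functions u and $\eta$, and all names are assumptions of the example, not taken from the text.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def bending_integral(eps):
    """I{u + eps*eta} for F = (phi'')^2 on [0, 1], u = x^3, eta = x^2 (1 - x)^2."""
    u2 = lambda x: 6 * x                      # second derivative of u
    eta2 = lambda x: 2 - 12 * x + 12 * x * x  # second derivative of eta
    return simpson(lambda x: (u2(x) + eps * eta2(x)) ** 2, 0.0, 1.0)
```

The values for $\varepsilon$ and $-\varepsilon$ agree and both exceed the value for $\varepsilon = 0$: the term linear in $\varepsilon$ vanishes, as condition (15b) requires.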


e. Several Independent Variables 


1In deriving (15e) from (15d) we have to restrict $\eta$ in Lemma I (p. 744) to functions of
class $C^4$ for which $\eta$ and $\eta'$ vanish at the end points. It is clear from the proof of the
lemma on p. 747 that the conclusion is valid under these more restrictive conditions.

The general method for finding necessary conditions for an extreme
value can equally well be applied when the integral is no longer a
simple integral but a multiple integral. Let D be a given region
bounded by a curve $\Gamma$ in the x, y-plane. We assume that D and $\Gamma$ are
sufficiently regular to permit application of the rule for integration by
parts (p. 557). Let $F(x, y, \varphi, \varphi_x, \varphi_y)$ be a function that is continuous and
twice continuously differentiable with respect to all five of its arguments.
If in F we substitute for $\varphi$ a function $\varphi(x, y)$ that has continuous
derivatives up to, and including, the second order in the region D
and has prescribed boundary values on $\Gamma$ and if we replace $\varphi_x$ and $\varphi_y$
by the partial derivatives of $\varphi$, F becomes a function of x and y, and
the integral


(16a)    $I\{\varphi\} = \iint_D F(x, y, \varphi, \varphi_x, \varphi_y)\,dx\,dy$


has a value depending on the choice of $\varphi$. The problem is that of finding
a function $\varphi = u(x, y)$ for which this value is an extreme value.

To find necessary conditions we again use the method employed above. We
choose a function $\eta(x, y)$ that vanishes on the boundary $\Gamma$; has continuous
derivatives up to, and including, the second order; and is
otherwise arbitrary. We assume that u is the required function and
then substitute $\varphi = u + \varepsilon\eta$ in the integral, where $\varepsilon$ is an arbitrary
parameter. The integral again becomes a function $G(\varepsilon)$, and a necessary
condition for an extreme value is


$G'(0) = 0.$


As before, this condition takes the form

(16b)    $\iint_D (\eta F_u + \eta_x F_{u_x} + \eta_y F_{u_y})\,dx\,dy = 0.$


To get rid of the terms in $\eta_x$ and $\eta_y$ under the integral sign we integrate
one term by parts with respect to x and the other with respect to y.
Since $\eta$ vanishes on $\Gamma$, the boundary values on $\Gamma$ fall out, and we have


(16c)    $\iint_D \eta \left( F_u - \frac{\partial}{\partial x} F_{u_x} - \frac{\partial}{\partial y} F_{u_y} \right) dx\,dy = 0.$


Lemma I (p. 744) can be extended at once to more dimensions than
one, and we immediately obtain Euler’s partial differential equation
of the second order,


$\frac{\partial}{\partial x} F_{u_x} + \frac{\partial}{\partial y} F_{u_y} - F_u = 0.$


Examples


1. $F = \varphi_x^2 + \varphi_y^2$. If we omit the factor 2, Euler’s differential equation
becomes


$\Delta u = u_{xx} + u_{yy} = 0.$


That is, Laplace’s equation has been obtained from a variation 
problem. 
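The link between the variation problem and Laplace’s equation can be observed on a grid. The sketch below, which is an illustration and not part of the original text, replaces the integral of $\varphi_x^2 + \varphi_y^2$ by a sum of squared differences between neighboring grid values (grid spacing 1 is assumed) and minimizes it over the interior values with the boundary values held fixed. Minimizing with respect to a single interior value sets it equal to the average of its four neighbors, which is the discrete form of $\Delta u = 0$. The boundary data $x^2 - y^2$, the grid size, and all names are assumptions of the example.

```python
def dirichlet_energy(u):
    """Sum of squared differences over all pairs of neighboring grid values."""
    n = len(u)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                total += (u[i + 1][j] - u[i][j]) ** 2
            if j + 1 < n:
                total += (u[i][j + 1] - u[i][j]) ** 2
    return total


def minimize_energy(n=9, sweeps=400):
    """Relax interior values to the neighbor average; the boundary is fixed."""
    boundary = lambda i, j: float(i * i - j * j)  # samples of x^2 - y^2
    u = [[boundary(i, j) if i in (0, n - 1) or j in (0, n - 1) else 0.0
          for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Exact minimizer of the energy in the single unknown u[i][j].
                u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                  + u[i][j - 1] + u[i][j + 1])
    return u
```

For these boundary values the relaxed grid reproduces $x^2 - y^2$ at the interior points, and changing any single interior value raises the discrete energy.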

2. Minimal surfaces. Plateau’s problem is this: To find, over a
region D, a surface z = f(x, y) that passes through a prescribed curve
in space whose projection is $\Gamma$ and whose area


$\iint_D \sqrt{1 + f_x^2 + f_y^2}\,dx\,dy$


is a minimum.
Here Euler’s differential equation is


$\frac{\partial}{\partial x} \frac{u_x}{\sqrt{1 + u_x^2 + u_y^2}} + \frac{\partial}{\partial y} \frac{u_y}{\sqrt{1 + u_x^2 + u_y^2}} = 0$


or, in expanded form,


$u_{xx}(1 + u_y^2) - 2u_{xy}u_xu_y + u_{yy}(1 + u_x^2) = 0.$


This is the celebrated differential equation of minimal surfaces, which
we have treated extensively elsewhere.¹
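The expanded equation can be tested numerically on a known surface. As an illustration not taken from the text, we use Scherk’s surface $u = \log(\cos y / \cos x)$, a classical minimal surface, and for contrast the paraboloid $u = x^2 + y^2$, which is not minimal. The sketch approximates the partial derivatives by central differences; the step size h and all function names are assumptions of the example.

```python
import math


def minimal_surface_residual(u, x, y, h=1e-4):
    """Left side of u_xx(1+u_y^2) - 2 u_xy u_x u_y + u_yy(1+u_x^2) at (x, y)."""
    ux = (u(x + h, y) - u(x - h, y)) / (2 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    uxx = (u(x + h, y) - 2 * u(x, y) + u(x - h, y)) / h ** 2
    uyy = (u(x, y + h) - 2 * u(x, y) + u(x, y - h)) / h ** 2
    uxy = (u(x + h, y + h) - u(x + h, y - h)
           - u(x - h, y + h) + u(x - h, y - h)) / (4 * h ** 2)
    return uxx * (1 + uy ** 2) - 2 * uxy * ux * uy + uyy * (1 + ux ** 2)


def scherk(x, y):
    """Scherk's surface, defined where cos x and cos y are positive."""
    return math.log(math.cos(y) / math.cos(x))


def paraboloid(x, y):
    return x * x + y * y
```

At an interior point such as (0.3, 0.5) the residual for Scherk’s surface is zero up to discretization error, whereas for the paraboloid it equals $4 + 8(x^2 + y^2)$.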


7.4 Problems Involving Subsidiary Conditions. Lagrange 
Multipliers 


In discussing ordinary extreme values for functions of several 
variables in Chapter 3 (p. 332) we considered the case where these 
variables are subject to certain subsidiary conditions. In this case 
the method of undetermined multipliers led to a particularly clear 
expression for the conditions that the function may have a stationary 
value. An analogous method is even more important in the calculus 
of variations. Here we shall briefly discuss only the simplest cases. 


a. Ordinary Subsidiary Conditions 


1R. Courant, Dirichlet’s Principle, Conformal Mapping and Minimal Surfaces,
Interscience: New York, 1950.

A typical case is that of finding a curve x = x(t), y = y(t), z = z(t),
where $t_0 \le t \le t_1$, in three-dimensional space, expressed in terms of
the parameter t, subject to the subsidiary condition that the curve
shall lie on a given surface G(x, y, z) = 0 and shall pass through two
given points A and B on that surface. The problem is then to make
an integral of the form


(17)    $\int_{t_0}^{t_1} F(x, y, z, \dot x, \dot y, \dot z)\,dt$


stationary by suitable choice of the functions x(t), y(t), z(t), subject to
the subsidiary condition G(x, y, z) = 0 and the usual boundary and
continuity conditions.

This problem can be immediately reduced to the cases discussed on
p. 753. We assume that x(t), y(t), z(t) are the required functions. We
assume further that on the portion of surface on which the required
curve is to lie z can be expressed in the form z = g(x, y); this is certainly
possible if $G_z$ differs from zero on this portion of the surface. If we
assume that on the surface in question the three equations $G_x = 0$,
$G_y = 0$, $G_z = 0$ are not simultaneously true and if we confine ourselves
to a sufficiently small portion of surface, we can suppose without
loss of generality that $G_z \ne 0$. Substituting z = g(x, y) and
$\dot z = g_x \dot x + g_y \dot y$ under the integral sign, we obtain a problem in which x(t)
and y(t) are functions independent of one another. Thus, we can
immediately apply the results of p. 755 and write down the conditions
that the integral I may be stationary, by applying equations
(13a) to the integrand


$F(x, y, g(x, y), \dot x, \dot y, \dot x g_x + \dot y g_y) = H(x, y, \dot x, \dot y).$


We then have the two equations


$\frac{d}{dt} H_{\dot x} - H_x = \frac{d}{dt} F_{\dot x} - F_x + \frac{d}{dt}(F_{\dot z} g_x) - F_z g_x - F_{\dot z} \frac{\partial \dot z}{\partial x} = 0,$
$\frac{d}{dt} H_{\dot y} - H_y = \frac{d}{dt} F_{\dot y} - F_y + \frac{d}{dt}(F_{\dot z} g_y) - F_z g_y - F_{\dot z} \frac{\partial \dot z}{\partial y} = 0.$


But


$\frac{d}{dt} g_x = \frac{\partial \dot z}{\partial x}, \qquad \frac{d}{dt} g_y = \frac{\partial \dot z}{\partial y},$


as we see at once on differentiation. Hence,


$\frac{d}{dt} F_{\dot x} - F_x + g_x \left( \frac{d}{dt} F_{\dot z} - F_z \right) = 0,$
$\frac{d}{dt} F_{\dot y} - F_y + g_y \left( \frac{d}{dt} F_{\dot z} - F_z \right) = 0.$


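If, for brevity, we write


(18a)    $\frac{d}{dt} F_{\dot z} - F_z = \lambda G_z$


with a suitable multiplier $\lambda(t)$ and use the relations (p. 229)
$g_x = -G_x/G_z$, $g_y = -G_y/G_z$, we obtain the two further equations


(18b)    $\frac{d}{dt} F_{\dot x} - F_x = \lambda G_x,$

(18c)    $\frac{d}{dt} F_{\dot y} - F_y = \lambda G_y.$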


We thus have the following condition that the integral may be
stationary: If we assume that $G_x$, $G_y$, $G_z$ do not all vanish simultaneously
on the surface G = 0, the necessary condition for an extreme
value is the existence of a multiplier $\lambda(t)$ such that the three equations
(18a, b, c) are simultaneously satisfied in addition to the subsidiary
condition G(x, y, z) = 0. That is, we have four symmetrical equations
determining the functions x(t), y(t), z(t) and the multiplier $\lambda$.

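The most important special case is the problem of finding the shortest
line joining two points A and B on a given surface G = 0, on
which it is assumed that the gradient of G does not vanish. Here


$F = \sqrt{\dot x^2 + \dot y^2 + \dot z^2},$


and Euler’s differential equations are


$\frac{d}{dt} \frac{\dot x}{\sqrt{\dot x^2 + \dot y^2 + \dot z^2}} = \lambda G_x,$
$\frac{d}{dt} \frac{\dot y}{\sqrt{\dot x^2 + \dot y^2 + \dot z^2}} = \lambda G_y,$
$\frac{d}{dt} \frac{\dot z}{\sqrt{\dot x^2 + \dot y^2 + \dot z^2}} = \lambda G_z.$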


These equations are invariant with respect to the introduction of a
new parameter $\tau$. That is, as the reader may easily verify for himself,
they retain the same form if t is replaced by any other parameter
$\tau = \tau(t)$, provided that the transformation is 1-1, reversible, and
continuously differentiable. If we take the arc length s as the new
parameter, so that $\dot x^2 + \dot y^2 + \dot z^2 = 1$, our differential equations take
the form

(19)    $\frac{d^2x}{ds^2} = \lambda G_x, \qquad \frac{d^2y}{ds^2} = \lambda G_y, \qquad \frac{d^2z}{ds^2} = \lambda G_z.$

The geometrical meaning of these differential equations is that the
principal normal vectors¹ of the extremals of our problem are orthogonal
to the surface G = 0. We call these curves geodesics of the
surface. The shortest distance between two points on a surface,
then, is necessarily given by an arc of a geodesic.
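Equations (19) say that the second derivative of a geodesic with respect to arc length is parallel to the gradient of G. The following sketch, an illustration not contained in the text, tests this on the unit sphere $G = x^2 + y^2 + z^2 - 1$, for which the gradient is 2(x, y, z). It compares a great circle with a circle of latitude, both traversed with unit speed; the particular curves, the finite-difference step, and all names are assumptions of the example.

```python
import math


def second_derivative(curve, s, h=1e-4):
    """Central-difference approximation to the second derivative of curve at s."""
    p0, p1, p2 = curve(s - h), curve(s), curve(s + h)
    return tuple((a - 2 * b + c) / h ** 2 for a, b, c in zip(p0, p1, p2))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def normal_defect(curve, s):
    """Length of x''(s) x grad G on the unit sphere; zero when (19) holds."""
    acc = second_derivative(curve, s)
    grad = tuple(2 * c for c in curve(s))
    return math.sqrt(sum(c * c for c in cross(acc, grad)))


def great_circle(s):
    a = 0.7  # tilt of the plane of the circle (arbitrary)
    return (math.cos(s), math.sin(s) * math.cos(a), math.sin(s) * math.sin(a))


def latitude_circle(s):
    z0 = 0.6                      # height of the circle of latitude
    r = math.sqrt(1 - z0 * z0)    # its radius
    return (r * math.cos(s / r), r * math.sin(s / r), z0)
```

For the great circle the defect vanishes up to discretization error, with $\lambda = -1/2$ in (19); for the circle of latitude it does not, so that curve is not a geodesic.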


Exercises 7.4a 


1. Show that the same geodesics are also obtained as the paths of a particle 
constrained to move on the given surface G = 0, subject to no external 
forces. In this case the potential energy U vanishes and the reader may 
apply Hamilton’s principle (p. 758). 

2. Let C be a curve on a given surface G(x, y, z) = 0. At each point of C 
take a perpendicular geodesic segment of fixed length and fixed orienta- 
tion relative to C. The free end of the geodesic segment generates a curve 
C’. Show that C’, too, is perpendicular to the geodesic segment. 


b. Other Types of Subsidiary Conditions 


In the problem discussed above we were able to eliminate the
subsidiary condition by solving the equation determining the subsidiary
condition and thus reducing the problem directly to the type
discussed previously. With the other kinds of subsidiary conditions
that frequently occur, however, it is not possible to do this. The most
important case of this type is the case of isoperimetric subsidiary
conditions. The following is a typical example: With the previous
boundary conditions and continuity conditions, the integral


(20a)    $I\{\varphi\} = \int_{x_0}^{x_1} F(x, \varphi, \varphi')\,dx$


is to be made stationary, the argument function $\varphi(x)$ being subject to
the further subsidiary condition


(20b)    $H\{\varphi\} = \int_{x_0}^{x_1} G(x, \varphi, \varphi')\,dx$ = a given constant c.


1That is, the vectors $(\ddot x, \ddot y, \ddot z)$; see p. 213.


The particular case $F = \varphi$, $G = \sqrt{1 + \varphi'^2}$ is the classical isoperimetric
problem.

This type of problem cannot be attacked by our previous method
of forming the “varied” function $\varphi = u + \varepsilon\eta$ by means of an arbitrary
function $\eta(x)$ vanishing on the boundary only, for in general, these
functions do not satisfy the subsidiary condition in a neighborhood
of $\varepsilon = 0$, except at $\varepsilon = 0$. We can attain the desired result, however,
by a method similar to that used in the original problem, by introducing,
instead of one function $\eta$ and one parameter $\varepsilon$, two
functions $\eta_1(x)$ and $\eta_2(x)$ that vanish on the boundary and two parameters
$\varepsilon_1$ and $\varepsilon_2$. Assuming that $\varphi = u$ is the required function, we
then form the varied function


$\varphi = u + \varepsilon_1\eta_1 + \varepsilon_2\eta_2.$


If we introduce this function into the two integrals, we reduce the
problem to the derivation of a necessary condition for the stationary
character of the integral


$I = \int_{x_0}^{x_1} F(x, u + \varepsilon_1\eta_1 + \varepsilon_2\eta_2, u' + \varepsilon_1\eta_1' + \varepsilon_2\eta_2')\,dx = K(\varepsilon_1, \varepsilon_2),$

subject to the subsidiary condition

$H = \int_{x_0}^{x_1} G(x, u + \varepsilon_1\eta_1 + \varepsilon_2\eta_2, u' + \varepsilon_1\eta_1' + \varepsilon_2\eta_2')\,dx = M(\varepsilon_1, \varepsilon_2) = c;$

the function $K(\varepsilon_1, \varepsilon_2)$ is to be stationary for $\varepsilon_1 = 0$, $\varepsilon_2 = 0$, where
$\varepsilon_1$, $\varepsilon_2$ satisfy the subsidiary condition

$M(\varepsilon_1, \varepsilon_2) = c.$


A simple discussion, based on the previous results for ordinary
extreme values with subsidiary conditions, and in other respects
following the same lines as the account given on p. 743, then leads
to this result:

Stationary character of the integral is equivalent to the existence of
a constant multiplier $\lambda$ such that the equation H = c and Euler’s
differential equation


$\frac{d}{dx} (F_{u'} + \lambda G_{u'}) - (F_u + \lambda G_u) = 0$


are satisfied. An exception to this can only occur if the function u satisfies
the equation

$\frac{d}{dx} G_{u'} - G_u = 0.$


The details of the proof may be left to the reader, who may consult
the literature on this subject.¹
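The multiplier rule can be tried on a simple case chosen for illustration only (it does not occur in the text): make $\int_0^1 \varphi'^2\,dx$ stationary subject to $\int_0^1 \varphi\,dx = c$ and $\varphi(0) = \varphi(1) = 0$. Here $F = \varphi'^2$ and $G = \varphi$, so Euler’s equation with multiplier reads $2u'' - \lambda = 0$, and the boundary and subsidiary conditions give $u = 6c\,x(1 - x)$ with $\lambda = -24c$. The sketch below perturbs u by $\varepsilon \sin 2\pi x$, which leaves both the boundary values and the subsidiary integral unchanged, and evaluates the integrals by Simpson’s rule. All names and the value c = 1 are assumptions of the example.

```python
import math

C = 1.0  # prescribed value of the subsidiary integral (illustrative)


def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def phi(x, eps):
    """Extremal 6 C x (1 - x) plus an admissible perturbation."""
    return 6 * C * x * (1 - x) + eps * math.sin(2 * math.pi * x)


def dphi(x, eps):
    return 6 * C * (1 - 2 * x) + eps * 2 * math.pi * math.cos(2 * math.pi * x)


def objective(eps):
    """Integral of phi'^2 over [0, 1]."""
    return simpson(lambda x: dphi(x, eps) ** 2, 0.0, 1.0)


def constraint(eps):
    """Integral of phi over [0, 1]; it should equal C for every eps."""
    return simpson(lambda x: phi(x, eps), 0.0, 1.0)
```

The subsidiary integral keeps the value c for every $\varepsilon$, while the objective grows like $2\pi^2\varepsilon^2$ above its value $12c^2$ at $\varepsilon = 0$; in this example the stationary function is in fact a minimum among the perturbed functions.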


Exercises 7.4b 


1. Show that the geodesics on a cylinder are helices. 
2. Find Euler’s equations in the following cases: 


(a) $F = \sqrt{1 + y'^2} + y\,g(x)$

(b) $F = \frac{y''^2}{(1 + y'^2)^3} + y\,g(x)$

(c) $F = y''^2 - y'^2 + y^2$

(d) $F = \sqrt[4]{1 + y'^2}$


3. If there are two independent variables, find Euler’s equations in the 
following cases: 


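(a) $F = a\varphi_x^2 + 2b\varphi_x\varphi_y + c\varphi_y^2 + d\varphi^2$
(b) $F = (\varphi_{xx} + \varphi_{yy})^2 = (\Delta\varphi)^2$
(c) $F = (\Delta\varphi)^2 + (\varphi_{xx}\varphi_{yy} - \varphi_{xy}^2)$.
4. Find Euler’s equations for the isoperimetric problem in which
$\int_{x_0}^{x_1} (au'^2 + 2buu' + cu^2)\,dx$
is to be stationary subject to the condition
$\int_{x_0}^{x_1} u^2\,dx = 1.$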
z0 
5. Let f(x) be a given function. The integral
$I(\varphi) = \int_0^1 f(x)\varphi(x)\,dx$
is to be made a maximum subject to the integral condition
$H(\varphi) = \int_0^1 \varphi^2\,dx = K^2,$
where K is a given constant.
(a) Find the solution u(x) from Euler’s equation.
(b) Prove by applying Cauchy’s inequality that the solution found in (a)
gives the absolute maximum for I.


1See, for example, M. R. Hestenes, Calculus of Variations and Optimal Control Theory. 
John Wiley and Sons, New York, 1966. R. Courant and D. Hilbert: Methods of 
Mathematical Physics, Interscience Publishers, New York, 1953, Vol. I, Chapter IV. 
6. Use the method of Lagrange’s multiplier to prove that the solution of
the classical isoperimetric problem is a circle.

7. A thread of uniform density and given length is stretched between two
points A and B. If gravity acts in the direction of the negative y-axis, the
equilibrium position of the thread is that in which the center of gravity
has the lowest possible position. It is accordingly a question of making
an integral of the form $\int_{x_0}^{x_1} y\sqrt{1 + y'^2}\,dx$ a minimum, subject to the
subsidiary condition that $\int_{x_0}^{x_1} \sqrt{1 + y'^2}\,dx$ has a given constant value. Show
that the thread will hang in a catenary.

8. Let y = u(x) yield the smallest value for the integral $\int_{x_0}^{x_1} F(x, y, y')\,dx$
among all continuously differentiable functions y(x) with prescribed
boundary values $y(x_0) = y_0$, $y(x_1) = y_1$. Prove that u(x) satisfies the
inequality $F_{y'y'}(x, u(x), u'(x)) \ge 0$ (Legendre’s condition) for all x in the
interval $x_0 \le x \le x_1$.

9. Let $(x_0, y_0)$ and $(x_1, y_1)$ be points lying above the x-axis. Find the extremals
for the area under the graph of a function passing through the two points
subject to the condition that the path between the two points has a fixed
length.


CHAPTER 
8 


Functions of a Complex Variable 


In Section 7.7 of Volume I we touched on the theory of functions 
of a complex variable and saw that this theory throws new light on 
the structure of functions of a real variable. Here we shall give a 
brief, but more systematic, account of the elements of that theory. 


8.1 Complex Functions Represented by Power Series 


a. Limits and Infinite Series with Complex Terms 


We start from the elementary concept of a complex number z = x + iy
(cf. Volume I, p. 104) formed from the imaginary unit i and any two
real numbers x, y. We operate with these complex numbers just as we
do with real numbers, with the additional rule that $i^2$ may always be
replaced by -1. We represent x, the real part, and y, the imaginary part
of z, by rectangular coordinates in an x, y-plane or a complex z-plane.
The number $\bar z = x - iy$ is called the complex number conjugate to z.
We introduce polar coordinates $(r, \theta)$ by means of the relations
$x = r \cos\theta$, $y = r \sin\theta$ and call $\theta$ the argument (or amplitude) of the
complex number and


$r = \sqrt{x^2 + y^2} = \sqrt{z\bar z} = |z|$

its absolute value (or modulus). We recall that

$|z_1 z_2| = |z_1|\,|z_2|.$


We can immediately establish the so-called triangle inequality
satisfied by the complex numbers $z_1$, $z_2$, and $z_1 + z_2$,

$|z_1 + z_2| \le |z_1| + |z_2|,$

and the further inequality

$|u_1| - |u_2| \le |u_1 - u_2|,$


which follows immediately from it if we put $z_1 = u_1 - u_2$, $z_2 = u_2$.

The triangle inequality may be interpreted geometrically if we
represent the complex numbers $z_1$, $z_2$ by vectors in the x, y-plane
with components $x_1$, $y_1$ and $x_2$, $y_2$, respectively. The vector that represents
the sum $z_1 + z_2$ is then simply obtained by vector addition
of the first two vectors. The lengths of the sides of the triangle formed
by this addition (see Fig. 8.1) are


$|z_1|, \quad |z_2|, \quad |z_1 + z_2|.$


Figure 8.1 The triangle inequality for complex numbers.


Thus, the triangle inequality expresses the fact that any one side of 
a triangle is less than the sum of the other two. 

The essentially new concept that we now consider is that of the
limit of a sequence of complex numbers. We state the following definition:
a sequence of complex numbers $z_n$ tends to a limit z provided
$|z_n - z|$ tends to zero. This, of course, means that the real part and the
imaginary part of $z_n - z$ both tend to zero. It follows that Cauchy’s
test applies: the necessary and sufficient condition for the existence
of a limit z of a sequence $z_n$ is

$\lim_{n, m \to \infty} |z_n - z_m| = 0.$

A particularly important class of limits arises from infinite series
with complex terms. We say that the infinite series with complex
terms,

$\sum_{\nu=0}^{\infty} c_\nu,$


converges and has the sum S if the sequence of partial sums,

$S_n = \sum_{\nu=0}^{n} c_\nu,$

tends to the limit S. If the real series with nonnegative terms

$\sum_{\nu=0}^{\infty} |c_\nu|$


converges, it follows, just as in Chapter 7 of Volume I (p. 514), that
the original series with complex terms also converges. The latter
series is then said to be absolutely convergent.

If the terms $c_\nu$ of the series, instead of being constants, depend on
(x, y), the coordinates of a point varying in a region R, the concept
of uniform convergence acquires a meaning. The series is said to be
uniformly convergent in R if for an arbitrarily small prescribed
positive $\varepsilon$ a fixed bound N can be found, depending on $\varepsilon$ only, such
that for every $n \ge N$ the relation $|S_n - S| < \varepsilon$ holds, no matter
where the point z = x + iy lies in the region R. Uniform convergence
of a sequence of complex functions $S_n(z)$ depending on the point z of
R is, of course, defined in exactly the same way. All these relations and
definitions and the associated proofs correspond exactly to those with
which we are already familiar from the theory of real variables.

The simplest example of a convergent series is the geometric series


$1 + z + z^2 + z^3 + \cdots.$


As for a real variable, the nth partial sum of this series is


$S_n = \frac{1 - z^{n+1}}{1 - z},$

and

(8.1)    $1 + z + z^2 + \cdots = \frac{1}{1 - z}$    for $|z| < 1$.


We see that the geometric series converges absolutely provided $|z| < 1$
and that the convergence is uniform provided $|z| \le q$, where q is
any fixed positive number between 0 and 1. In other words, the geometric
series converges absolutely for all values of z within the unit
circle and converges uniformly in every closed circle concentric with the
unit circle and with a radius less than unity.
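The uniformity of the convergence can be seen numerically. Since $S_n - 1/(1 - z) = -z^{n+1}/(1 - z)$, the error on the closed circle $|z| \le q$ is at most $q^{n+1}/(1 - q)$, a bound that does not depend on z. The following sketch, added as an illustration, samples the boundary circle $|z| = q$ and compares the largest observed error with this bound; the function names and the number of sample points are assumptions of the example.

```python
import cmath


def partial_sum(z, n):
    """S_n = 1 + z + ... + z^n, accumulated term by term."""
    total, term = 0j, 1 + 0j
    for _ in range(n + 1):
        total += term
        term *= z
    return total


def worst_error_on_circle(q, n, samples=360):
    """Largest |S_n(z) - 1/(1 - z)| over equally spaced points with |z| = q."""
    worst = 0.0
    for k in range(samples):
        z = q * cmath.exp(2j * cmath.pi * k / samples)
        worst = max(worst, abs(partial_sum(z, n) - 1 / (1 - z)))
    return worst
```

For q = 0.9 the observed maximum is attained at z = q and agrees with the bound $q^{n+1}/(1 - q)$, which tends to zero as n increases.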

For the investigation of convergence the comparison test is again
available: If $|c_\nu| \le p_\nu$, where $p_\nu$ is real and nonnegative and if the
infinite series

$\sum_{\nu=0}^{\infty} p_\nu$

converges, then the complex series $\sum c_\nu$ converges absolutely.

If the $p_\nu$’s are constants, while the $c_\nu$’s depend on a point z varying
in R, the series $\sum c_\nu$ converges uniformly in the region in question.
The proofs are the same, word for word, as the corresponding proofs
for a real variable (Volume I, Chapter 7, p. 535) and therefore need
not be repeated here.

If M is an arbitrary positive constant and q a positive number
between 0 and 1, the infinite series with the positive terms $p_\nu = Mq^\nu$
or $M\nu q^{\nu-1}$ or


$\frac{M}{\nu + 1}\, q^{\nu+1}$


also converge, as we know from Volume I, p. 543. We shall immediately
make use of these series for purposes of comparison.


b. Power Series 


The most important infinite series with complex terms are power
series, in which $c_\nu$ is of the form $c_\nu = a_\nu z^\nu$; that is, a power series
may be expressed in the form


$P(z) = \sum_{\nu=0}^{\infty} a_\nu z^\nu$

or, somewhat more generally, in the form


$\sum_{\nu=0}^{\infty} a_\nu (z - z_0)^\nu,$

where $z_0$ is a fixed point. As this form can, however, always be reduced
to the preceding one by the substitution $z' = z - z_0$, we need
only consider the case where $z_0 = 0$.
The main theorem on power series is word for word the same as
the corresponding theorem for real power series in Chapter 7 of
Volume I (p. 541):
If the power series converges for $z = \xi$, it converges absolutely for
every value of z such that $|z| < |\xi|$. Further, if q is a positive number
less than 1, the series converges uniformly within the circle $|z| \le q|\xi|$.


We can at once proceed to the following further theorem:


The two series


$D(z) = \sum_{\nu=1}^{\infty} \nu a_\nu z^{\nu-1},$


$I(z) = \sum_{\nu=0}^{\infty} \frac{a_\nu}{\nu + 1}\, z^{\nu+1}$


also converge absolutely and uniformly if $|z| \le q|\xi|$.

The proof follows exactly as before. Since the series P(z) converges
for $z = \xi$, it follows that the nth term, $a_n\xi^n$, tends to zero as n increases.
Hence, a positive constant M certainly exists such that the
inequality $|a_n\xi^n| < M$ holds for all values of n. If now $|z| \le q|\xi|$,
where 0 < q < 1, we have


$|a_n z^n| < Mq^n, \qquad |n a_n z^{n-1}| < \frac{M}{|\xi|}\, n q^{n-1}, \qquad \left| \frac{a_n}{n + 1}\, z^{n+1} \right| < \frac{M|\xi|}{n + 1}\, q^{n+1}.$


We thus obtain comparison series that, as we have seen already
(p. 771), converge absolutely. Our theorem is thus proved.

In the case of a power series there are two possibilities: either it
converges for all values of z or there are values $z = \eta$ for which it
diverges. In the latter case, by the preceding theorem, the series must diverge for
all values of z for which $|z| > |\eta|$ (cf. Volume I, p. 541), and just as in
the case of real power series, there is a radius of convergence $\rho$ such
that the series converges when $|z| < \rho$ and diverges when $|z| > \rho$.
The same applies to the two series D(z) and I(z), the value of $\rho$ being
the same as for the original series. The circle $|z| = \rho$ is called the
circle of convergence of the power series. No general statement can be
made about the convergence or divergence of the series on the
circumference of the circle itself, that is, for $|z| = \rho$.


c. Differentiation and Integration of Power Series 


A convergent power series


$P(z) = \sum_{\nu=0}^{\infty} a_\nu z^\nu$

defines a function of the complex variable z in the interior of its circle
of convergence. In that region it is the limit to which the polynomials


$P_n(z) = \sum_{\nu=0}^{n} a_\nu z^\nu$


tend as n tends to infinity.

A polynomial f(z) may be differentiated with respect to the independent
variable z in exactly the same way as for a real variable. In
the first place, we notice that the algebraic identity


$\frac{z_1^n - z^n}{z_1 - z} = z_1^{n-1} + z_1^{n-2} z + \cdots + z^{n-1}$


holds. If we now let $z_1$ tend to z,¹ we immediately have

$\frac{d}{dz} z^n = \lim_{z_1 \to z} \frac{z_1^n - z^n}{z_1 - z} = n z^{n-1}.$

In the same way, we immediately have


$P_n'(z) = \frac{d}{dz} P_n(z) = \lim_{z_1 \to z} \frac{P_n(z_1) - P_n(z)}{z_1 - z} = \sum_{\nu=1}^{n} \nu a_\nu z^{\nu-1} = D_n(z).$


We naturally call the expression $P_n'(z)$ the derivative of the complex
polynomial $P_n(z)$.

We now have the following theorem, which is fundamental in the
theory of power series:


A convergent power series

(8.2a)    $P(z) = \sum_{\nu=0}^{\infty} a_\nu z^\nu$


may be differentiated term by term in the interior of its circle of convergence.
That is, the limit


(8.2b)    $P'(z) = \lim_{z_1 \to z} \frac{P(z_1) - P(z)}{z_1 - z}$

exists, and

(8.2c)    $P'(z) = \sum_{\nu=1}^{\infty} \nu a_\nu z^{\nu-1} = \lim_{n \to \infty} P_n'(z) = \lim_{n \to \infty} D_n(z) = D(z).$


1The concept of a limit for a continuous complex variable ($z_1 \to z$) can be introduced
in exactly the same way as for a real variable.


From this theorem it is at once clear that the power series


$I(z) = \sum_{\nu=0}^{\infty} \frac{a_\nu}{\nu + 1}\, z^{\nu+1}$


may be regarded as the indefinite integral of the first power series, that
is, that $I'(z) = P(z)$.

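The term-by-term differentiability of the power series is proved in
the following way:

From p. 773 we know that the relation


$D(z) = \lim_{n \to \infty} D_n(z)$


holds within the circle of convergence. We have to prove that the
difference quotient


$\frac{P(z_1) - P(z)}{z_1 - z}$


differs in absolute value from D(z) by less than a prescribed positive
number $\varepsilon$ if only we take $z_1$ sufficiently close to z within the circle
of convergence. For this purpose, we form the difference quotient

$D(z_1, z) = \frac{P(z_1) - P(z)}{z_1 - z} = \frac{P_n(z_1) - P_n(z)}{z_1 - z} + R_n, \qquad R_n = \sum_{\nu=n+1}^{\infty} a_\nu \lambda_\nu,$

where for brevity we write


$\lambda_\nu = \frac{z_1^\nu - z^\nu}{z_1 - z} = z_1^{\nu-1} + z_1^{\nu-2} z + \cdots + z^{\nu-1}.$


If we keep to the notation used on p. 773 and if $|z| < q|\xi|$ and
$|z_1| < q|\xi|$, then


$|\lambda_\nu| \le \nu q^{\nu-1} |\xi|^{\nu-1}.$

Hence,


$|R_n| = \left| \sum_{\nu=n+1}^{\infty} a_\nu \lambda_\nu \right| \le \sum_{\nu=n+1}^{\infty} |a_\nu|\, \nu q^{\nu-1} |\xi|^{\nu-1} \le \frac{M}{|\xi|} \sum_{\nu=n+1}^{\infty} \nu q^{\nu-1}.$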
Owing to the convergence of the series of positive terms $\sum \nu q^{\nu-1}$, the
expression $|R_n|$ can therefore be made as small as we please, provided
we make n sufficiently large. We choose n so large that this expression
is less than $\varepsilon/3$ and so large (increasing n further if necessary) that


$|D(z) - D_n(z)| < \varepsilon/3.$

We now choose $z_1$ so close to z that


$\frac{P_n(z_1) - P_n(z)}{z_1 - z}$


also differs in absolute value from $D_n(z)$ by less than $\varepsilon/3$. Then,


$|D(z_1, z) - D(z)| \le \left| \frac{P_n(z_1) - P_n(z)}{z_1 - z} - D_n(z) \right| + |D_n(z) - D(z)| + |R_n| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon,$


and this inequality expresses the fact asserted.

Since the derivative of the function is again a power series with 
the same radius of convergence, we can differentiate again and repeat 
the process as often as we like. That is, a power series can be differ- 
entiated as often as we please in the interior of its circle of convergence. 
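For the geometric series the theorem can be checked against closed forms: differentiating $1/(1 - z)$ once and twice gives $1/(1 - z)^2$ and $2/(1 - z)^3$. The sketch below, which is an illustration and not part of the text, differentiates a truncated coefficient list term by term and evaluates the result by Horner’s scheme; the truncation length and the function names are assumptions of the example.

```python
def differentiate(coeffs):
    """Coefficients of the term-by-term derivative of sum a_v z^v."""
    return [v * a for v, a in enumerate(coeffs)][1:]


def evaluate(coeffs, z):
    """Horner evaluation of the polynomial with the given coefficients."""
    total = 0j
    for a in reversed(coeffs):
        total = total * z + a
    return total
```

With 400 coefficients all equal to 1 and a point such as z = 0.3 + 0.4i, where |z| = 0.5, the once and twice differentiated series agree with the closed forms to within rounding error.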

Power series are the Taylor series of the functions P(z) that they represent;
that is, the coefficients $a_\nu$ may be expressed by the formula


(8.3)    $a_\nu = \frac{1}{\nu!}\, P^{(\nu)}(0).$


The proof is word for word the same as for a real variable (cf. 
Volume I, p. 545). 


d. Examples of Power Series 


As we mentioned in Chapter 7 (p. 553) of Volume I, the power
series for the elementary functions can immediately be extended to
the complex variable; in other words, we can regard the power series
for the elementary functions as complex power series and extend the
definitions of these functions to the complex realm in this way. For
example, the series

$\sum_{\nu=0}^{\infty} \frac{z^\nu}{\nu!}, \quad \sum_{\nu=0}^{\infty} \frac{(-1)^\nu z^{2\nu}}{(2\nu)!}, \quad \sum_{\nu=0}^{\infty} \frac{(-1)^\nu z^{2\nu+1}}{(2\nu+1)!}, \quad \sum_{\nu=0}^{\infty} \frac{z^{2\nu}}{(2\nu)!}, \quad \sum_{\nu=0}^{\infty} \frac{z^{2\nu+1}}{(2\nu+1)!}$


converge for all values of z. (This follows at once from comparison
tests.) The functions represented by these power series are again
denoted, respectively, by the symbols $e^z$, cos z, sin z, cosh z, sinh z,
just as in the real case. The relations


(8.4a)    $\cos z + i \sin z = e^{iz},$
(8.4b)    $\cosh z = \cos iz, \qquad i \sinh z = \sin iz$


now follow immediately from the power series. Again, by differentiating
term by term, we obtain the relation


(8.4c)    $\frac{d}{dz} e^z = e^z.$
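Relations (8.4a) and (8.4b) can be confirmed numerically by summing the series directly. The sketch below is an illustration only; the numbers of terms retained and the function names are assumptions of the example, and the library function `cmath.cosh` serves merely as an independent reference value.

```python
import cmath


def exp_series(z, terms=60):
    """Partial sum of the series sum z^v / v!."""
    total, term = 0j, 1 + 0j
    for v in range(terms):
        total += term
        term *= z / (v + 1)
    return total


def cos_series(z, terms=30):
    """Partial sum of the series sum (-1)^v z^(2v) / (2v)!."""
    total, term = 0j, 1 + 0j
    for v in range(terms):
        total += term
        term *= -z * z / ((2 * v + 1) * (2 * v + 2))
    return total


def sin_series(z, terms=30):
    """Partial sum of the series sum (-1)^v z^(2v+1) / (2v+1)!."""
    total, term = 0j, z + 0j
    for v in range(terms):
        total += term
        term *= -z * z / ((2 * v + 2) * (2 * v + 3))
    return total
```

For a sample point such as z = 1.2 - 0.7i the combination cos z + i sin z agrees with the exponential series evaluated at iz, and the cosine series evaluated at iz agrees with cosh z.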
As examples of power series with a finite radius of convergence,
other than the geometric series, we consider the series


(8.4d)    $\log (1 + z) = \sum_{\nu=1}^{\infty} (-1)^{\nu+1} \frac{z^\nu}{\nu},$


$\operatorname{arc\,tan} z = \sum_{\nu=0}^{\infty} (-1)^\nu \frac{z^{2\nu+1}}{2\nu + 1} = \frac{1}{2i} [\log (1 + iz) - \log (1 - iz)],$


whose sums we again denote by log and arc tan. Here the radius of
convergence is again 1. Differentiating term by term, we obtain
geometric series and find


$\frac{d \log (1 + z)}{dz} = \frac{1}{1 + z}, \qquad \frac{d (\operatorname{arc\,tan} z)}{dz} = \frac{1}{1 + z^2}.$

Exercises 8.1 


1. (a) Show that the operation of taking the conjugate of a complex number
distributes over rational algebraic operations, for example,


$\overline{\alpha\beta} = \bar\alpha\,\bar\beta.$
(b) Prove that if f(z) is defined by a power series with real coefficients,
then $f(\bar z) = \overline{f(z)}$.
2. (a) Prove for a polynomial P(z) with real coefficients that $\alpha$ is a root if
and only if its complex conjugate is a root.


(b) Prove under the assumption above that if $P(\alpha) = 0$ and $\alpha$ is not real,
$\alpha = a + ib$ and $b \ne 0$, then P(z) has the real quadratic factor
$(z - \alpha)(z - \bar\alpha) = z^2 - 2az + a^2 + b^2.$


3. (a) Show that $|z - \alpha| = \lambda|z - \beta|$, $\lambda \ne 1$, $\lambda$ real, is the equation of a
circle. Determine the center $z_0$ and the radius r of the circle. If $\lambda = 1$
what is the locus of this equation?


(b) Show that the general linear transformation
$\zeta = \frac{\alpha z + \beta}{\gamma z + \delta},$
where $\alpha\delta - \beta\gamma \ne 0$, transforms circles and straight lines into circles
and straight lines.
4. For which points z = x + iy is


$\left| \frac{z - 1}{z + 1} \right| < 1?$

5. Prove that if $\sum a_n z^n$ is absolutely convergent for $z = \xi$, then it is uniformly
convergent for every z such that $|z| \le |\xi|$.

6. Using the power series for cos z and sin z, show that


$\cos^2 z + \sin^2 z = 1.$


7. For what values of z is 


convergent? 


8.2 Foundations of the General Theory of Functions of a 
Complex Variable 


a. The Postulate of Differentiability 


As we have seen above, all functions that are represented by 
power series possess a derivative and an indefinite integral. This fact 
may be made the starting point for the general theory of functions 
of a complex variable. The object of such a theory is to extend the 
differential and integral calculus to functions of a complex variable. 
In particular, it is important that the concept of function should be 
generalized for complex independent variables in such a way that it 
comprises any function that is differentiable in a complex region. 

We could, of course, confine ourselves from the very beginning 
to the consideration of functions that are represented by power series 
and thus satisfy the postulate of differentiability. There are, however, 
two objections to this procedure. In the first place, we cannot tell a 
priori whether the postulate of the differentiability of a complex 
function necessarily implies that the function can be expanded in a 
power series. (In the case of the real variable we saw that functions 
even exist that possess derivatives of any order and yet cannot be
expanded in a power series; cf. Volume I, p. 462.) In the second place,
we learn even from the simple function 1/(1 - z), whose power
series, the geometric series, converges in the unit circle only, that
even for simple functional expressions the power series does not
everywhere represent the function, which in this particular case we
already know in other ways.

These difficulties can be avoided by a method of Weierstrass, and 
the theory of functions of a complex variable can actually be de- 
veloped on the basis of the theory of power series. It is desirable, 
however, to emphasize another point of view, that of Cauchy and 
Riemann. In their method, functions are characterized not by explicit 
expressions but by simple properties. More precisely, the property that 
a function shall be differentiable, and not that it shall be capable of 
being represented by a power series, is to be used to mark out the 
domain in which a function is defined. 

We start from the general concept of a complex function $\zeta = f(z)$ 
of the complex variable z. If R is a region of the z-plane and if with 
every point z = x + iy in R we associate a complex number $\zeta = u + iv$ 
by means of any relation, $\zeta$ is said to be a complex function of z in 
R. This definition, therefore, merely expresses the fact that every pair 
of real numbers x, y, such that the point (x, y) lies in R, has a cor- 
responding pair of real numbers u, v, that is, that u and v are any 
two real functions u(x, y) and v(x, y), defined in R, of the two real 
variables x and y. 

This concept of function embraces too much for complex calculus. 
We limit it in the first place by the condition that u(x, y) and v(x, y) 
must be continuous functions in R with continuous first derivatives 
$u_x, u_y, v_x, v_y$. Further, we insist that our expression $u + iv = \zeta = f(z)$ 
= f(x + iy) shall be differentiable in R with respect to the complex in- 
dependent variable z; that is, the limit 

$$ \lim_{z_1 \to z} \frac{f(z_1) - f(z)}{z_1 - z} = \lim_{h \to 0} \frac{f(z + h) - f(z)}{h} = f'(z) $$

shall exist for all values of z in R. This limit is then called the 
derivative of f(z). 

In order that the function may be differentiable, it is by no means 
sufficient that u and v should possess continuous derivatives with re- 
spect to x and y. Our postulate of differentiability implies far more 
than differentiability does for functions of real variables, since h = 
r + is can tend to zero through both real values (s = 0) and purely 
imaginary values (r = 0) or in any other way, and the same limit f'(z) 
must result in all cases if the function is to be differentiable. 

If, for example, we put u = x, v = 0, that is, f(z) = f(x + iy) = x, 
we have a correspondence in which u(x, y) and v(x, y) are continu- 
ously differentiable. For the derivative of f with respect to z, however, 
by putting h = r, we obtain 

$$ \lim_{r \to 0} \frac{f(z + r) - f(z)}{r} = \lim_{r \to 0} \frac{x + r - x}{r} = 1, $$

whereas if we put h = is, we have 

$$ \lim_{s \to 0} \frac{f(z + is) - f(z)}{is} = \lim_{s \to 0} \frac{x - x}{is} = 0; $$

that is, we obtain two entirely different limits. For $\zeta = u + iv = x + 2iy$ 
we similarly obtain different limits for the difference quotient as 
h tends to zero in different ways. 
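
The two limits above can be observed numerically. The following is a 
minimal sketch, assuming a small fixed increment in place of the limit; 
the names difference_quotient, real_part, stretched, and square are 
chosen here for illustration only. For f(z) = x and f(z) = x + 2iy the 
quotient depends on the direction of h, whereas for the polynomial 
$z^2$ both directions give nearly the same value. 

```python
def difference_quotient(f, z, h):
    """Return (f(z + h) - f(z)) / h for a nonzero complex increment h."""
    return (f(z + h) - f(z)) / h


def real_part(z):
    # f(z) = x, that is, u = x and v = 0
    return complex(z.real, 0.0)


def stretched(z):
    # f(z) = x + 2iy, that is, u = x and v = 2y
    return complex(z.real, 2.0 * z.imag)


def square(z):
    # f(z) = z^2, a polynomial and hence analytic
    return z * z


z0 = complex(0.3, -0.7)   # arbitrary sample point
step = 1e-6               # small increment standing in for the limit

for name, f in [("x", real_part), ("x + 2iy", stretched), ("z^2", square)]:
    along_real = difference_quotient(f, z0, step)        # h = r
    along_imag = difference_quotient(f, z0, 1j * step)   # h = is
    print(name, along_real, along_imag)
```

A finite increment only approximates the limit, so agreement for $z^2$ 
holds up to an error of the order of the increment. 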

Thus, in order to ensure the differentiability of f(z) with respect to 
z we have to impose yet another restriction. This fundamental fact 
in the theory of functions of a complex variable is expressed by the 
following theorem: 


If $\zeta = u(x, y) + iv(x, y) = f(z) = f(x + iy)$, where u(x, y) and 
v(x, y) are continuously differentiable, the necessary and sufficient 
conditions that the function f(z) be differentiable in the complex region 
are the so-called Cauchy-Riemann differential equations 


$$ u_x = v_y, \qquad u_y = -v_x. \tag{8.5a} $$


In every open set R where u and v are continuously differentiable and 
satisfy these conditions, f(z) is said to be an analytic¹ function of the 
complex variable z, and the derivative of f(z) is given by 


$$ f'(z) = u_x + iv_x = v_y - iu_y = \frac{1}{i}(u_y + iv_y). \tag{8.5b} $$


¹The term holomorphic is also used. A deeper theorem, not proved here, asserts that 
for f differentiable in a region, the derivatives of u and v not only exist but automati- 
cally are continuous. Hence, actually, differentiability of f implies continuous 
differentiability. In what follows, however, we shall not make use of that theorem 
and always assume that the differentiable f considered have continuously differenti- 
able real and imaginary parts or, equivalently, that f'(z) is a continuous function of 
z. 

We shall first show that the Cauchy-Riemann differential equations 
constitute a necessary condition. We assume that f'(z) exists. 
Accordingly, we must obtain the limit f'(z) by taking h equal to a real 
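quantity r. That is, 


$$ f'(z) = \lim_{r \to 0} \left[ \frac{u(x + r, y) - u(x, y)}{r} + i\,\frac{v(x + r, y) - v(x, y)}{r} \right] = u_x + iv_x. $$


In the same way, we must obtain f'(z) if we take h to be a pure imagi- 
nary is; that is, we must have 


$$ f'(z) = \lim_{s \to 0} \left[ \frac{u(x, y + s) - u(x, y)}{is} + i\,\frac{v(x, y + s) - v(x, y)}{is} \right] = \frac{1}{i}(u_y + iv_y). $$


Hence, 

$$ u_x + iv_x = \frac{1}{i}(u_y + iv_y). $$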


By equating real and imaginary parts, we at once obtain the Cauchy- 
Riemann equations. 

These equations, however, also form a sufficient condition for the 
differentiability of the function f(z). To prove this, we form the differ- 
ence quotient [see formula (13) p. 41] 


$$ \frac{f(z + h) - f(z)}{h} = \frac{u(x + r, y + s) - u(x, y) + i[v(x + r, y + s) - v(x, y)]}{r + is} $$

$$ = \frac{ru_x + su_y + irv_x + isv_y + \varepsilon_1|h| + i\varepsilon_2|h|}{r + is}, $$

where $\varepsilon_1$ and $\varepsilon_2$ are two real quantities that tend to zero with 
$|h| = \sqrt{r^2 + s^2}$. If now the Cauchy-Riemann equations hold, the 
numerator terms combine as $ru_x - sv_x + irv_x + isu_x = (r + is)(u_x + iv_x)$, 
and the above expression immediately becomes 


$$ u_x + iv_x + \varepsilon_1 \frac{|h|}{r + is} + i\varepsilon_2 \frac{|h|}{r + is}. $$


Since |h|/(r + is) has absolute value 1, we see at once that as h -> 0, 
this expression tends to the limit $u_x + iv_x$ independently of the way 
in which the passage to the limit h -> 0 is carried out. 

We now use the Cauchy-Riemann equations, or the property of 
differentiability that is equivalent to them, as the definition of an 
analytic function, on which we shall base our deduction of all the 
properties of such functions. 
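
As a numerical illustration of the theorem, the sketch below estimates 
the two Cauchy-Riemann residuals $u_x - v_y$ and $u_y + v_x$ by central 
differences. It is an approximation under stated assumptions: the 
helper name cauchy_riemann_residual and the step d are hypothetical 
choices made for this example. 

```python
import cmath


def cauchy_riemann_residual(f, x, y, d=1e-6):
    """Estimate (u_x - v_y, u_y + v_x) at (x, y) by central differences."""
    # Partial derivative with respect to x gives u_x + i v_x
    fx = (f(complex(x + d, y)) - f(complex(x - d, y))) / (2 * d)
    # Partial derivative with respect to y gives u_y + i v_y
    fy = (f(complex(x, y + d)) - f(complex(x, y - d))) / (2 * d)
    return fx.real - fy.imag, fy.real + fx.imag


def stretched(z):
    # f(z) = x + 2iy, the non-differentiable example from the text
    return complex(z.real, 2.0 * z.imag)


res_exp = cauchy_riemann_residual(cmath.exp, 0.4, 1.1)
res_stretched = cauchy_riemann_residual(stretched, 0.4, 1.1)
print(res_exp)        # both entries close to zero
print(res_stretched)  # first entry close to 1 - 2 = -1
```

Both residuals vanish (up to rounding) for the exponential function, 
while for x + 2iy the first residual is $u_x - v_y = 1 - 2 = -1$. 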


b. The Simplest Operations of the Differential Calculus 


All polynomials and all power series in the interior of their circle 
of convergence are analytic functions (see p. 776). We see at once that 
the operations that lead to the elementary rules of the differential 
calculus can be carried out in exactly the same way as for the real 
variable (see Volume I, pp. 201-206, 218-220). In particular, the 
following rules hold: The sum, the difference, the product, and 
(provided the denominator does not vanish) the quotient of analytic 
functions can be differentiated according to the elementary rules 
of the calculus and, hence, are again analytic functions. Further, an 
analytic function of an analytic function can be differentiated ac- 
cording to the chain rule and therefore is itself an analytic function. 

We also note the following theorem: 


If the derivative of an analytic function $\zeta = f(z)$ vanishes everywhere 
in a region R, the function is a constant. 

PROOF. We have by (8.5a, b) $v_y - iu_y = 0$ everywhere in R. Hence, 
$v_y = 0$, $u_y = 0$, and by virtue of the Cauchy-Riemann equations, 
$u_x = 0$, $v_x = 0$; that is, u and v are constants; hence, $\zeta$ is a constant. 


Application to the Exponential Function 


We use this theorem to derive some of the basic properties of the 
exponential function, defined for all complex z by the power series 


$$ e^z = \sum_{k=0}^{\infty} \frac{z^k}{k!} = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots. $$


Since we may differentiate this series (see p. 776), we find that 


$$ \frac{d}{dz} e^z = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots = e^z. \tag{8.6} $$


Thus, the exponential function $f(z) = e^z$ is a solution of the differential 
equation 


$$ f'(z) = f(z) $$



for all z. By the chain rule of differentiation, it follows then for any 
fixed complex $\zeta$ that 

$$ \frac{d}{dz}\left[e^{z+\zeta} e^{-z}\right] = \frac{d}{dz}\left[f(z + \zeta) f(-z)\right] $$
$$ = f'(z + \zeta) f(-z) - f(z + \zeta) f'(-z) $$
$$ = f(z + \zeta) f(-z) - f(z + \zeta) f(-z) = 0. $$

Using the theorem above, we see that 

$$ e^{z+\zeta} e^{-z} $$

is a constant independent of z. We find the value of this constant by 
putting z = 0, and since $e^0 = 1$, obtain 


$$ e^{z+\zeta} e^{-z} = e^{\zeta} \tag{8.6a} $$

for all z and $\zeta$. For $\zeta = 0$ it follows that 

$$ e^{z} e^{-z} = 1. \tag{8.6b} $$


Consequently, the exponential function is different from zero for all 
complex z and the reciprocal of $e^z$ is $e^{-z}$. Multiplying both sides of the 
identity (8.6a) by $e^z$ we arrive at the functional equation of the ex- 
ponential function 


$$ e^{z+\zeta} = e^{z} e^{\zeta}, \tag{8.6c} $$


which could not be derived as easily directly from the power series 
representation. 

If f(z) is any solution of the differential equation 

$$ f'(z) = f(z) \tag{8.7a} $$

we have 

$$ \frac{d}{dz}\left[f(z) e^{-z}\right] = f'(z) e^{-z} - f(z) e^{-z} = 0. $$

Hence, 


$$ f(z) e^{-z} = \text{constant} = c. $$


Thus, the most general solution of the differential equation (8.7a) has 
the form 


$$ f(z) = ce^{z} \tag{8.7b} $$


where c is a constant. 

We found on p. 777 that 

$$ e^{iz} = \cos z + i \sin z, \tag{8.8a} $$


where cos z and sin z are defined by their power series. Replacing z 
by -z, we find, since cos(-z) = cos z and sin(-z) = -sin z, 


$$ e^{-iz} = \cos z - i \sin z. $$

Multiplying the two relations, we see that 

$$ e^{iz} e^{-iz} = \cos^2 z + \sin^2 z. $$

Since $e^{iz} e^{-iz} = e^{iz - iz} = 1$, we have proved the identity 

$$ \cos^2 z + \sin^2 z = 1 \tag{8.8b} $$


for all complex z. 

By (8.6c) and (8.8a), 


$$ e^{x+iy} = e^{x} e^{iy} = e^{x}(\cos y + i \sin y). \tag{8.8c} $$


If here x and y are real, we find that the absolute value of $e^z = e^{x+iy}$ 
is given by 


$$ |e^z| = |e^{x+iy}| = |e^x \cos y + ie^x \sin y| = \sqrt{(e^x \cos y)^2 + (e^x \sin y)^2} = \sqrt{e^{2x}(\cos^2 y + \sin^2 y)} = e^x. \tag{8.8d} $$


Another important consequence of the relation (8.8a) connecting 
the exponential and trigonometric functions is obtained if we put 
$z = 2\pi$: 


$$ e^{2\pi i} = \cos(2\pi) + i \sin(2\pi) = 1. \tag{8.9a} $$

More generally, from (8.6c) for $\zeta = 2\pi i$, we have 

$$ e^{z + 2\pi i} = e^{z}. \tag{8.9b} $$


Thus, for complex arguments the exponential function is periodic and 
has the period $2\pi i$. 


Formula (8.8a) shows that for any integer n 


$$ e^{2n\pi i} = \cos(2n\pi) + i \sin(2n\pi) = 1. \tag{8.9c} $$

One easily sees that the values z of the form 

$$ z = 2n\pi i \qquad (n = \text{integer}) $$

are the only ones for which 

$$ e^{z} = 1, $$


for if z = x + iy, with real x, y, we find from $e^z = 1$ and (8.8d) that 
$e^x = 1$, and hence, x = 0. Then 


$$ 1 = e^{iy} = \cos y + i \sin y, $$

which yields 

$$ \cos y = 1, \qquad \sin y = 0. $$


Hence, y must be a multiple of $2\pi$. 

We conclude that an equation 


$$ e^{z} = e^{\zeta} \tag{8.9d} $$

can hold if and only if 

$$ z = \zeta + 2n\pi i, \tag{8.9e} $$

where n is an integer, for multiplying (8.9d) by $e^{-\zeta}$, we get 


$$ e^{z - \zeta} = e^{\zeta} e^{-\zeta} = 1. $$
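
The relations (8.8d), (8.9b), and (8.9e) lend themselves to a quick 
numerical check. The sketch below uses Python's standard cmath module; 
the helper differs_by_period and its tolerance are hypothetical names 
introduced only for this illustration. 

```python
import cmath
import math

z = complex(0.5, 2.0)  # arbitrary sample point

# (8.8d): |e^z| equals e^x
modulus_gap = abs(abs(cmath.exp(z)) - math.exp(z.real))

# (8.9b): e^(z + 2*pi*i) equals e^z
period_gap = abs(cmath.exp(z + 2j * math.pi) - cmath.exp(z))


def differs_by_period(z, zeta, tol=1e-9):
    """True if (z - zeta) / (2*pi*i) is within tol of an integer, cf. (8.9e)."""
    n = (z - zeta) / (2j * math.pi)
    return abs(n - round(n.real)) < tol


print(modulus_gap, period_gap)
print(differs_by_period(z + 6j * math.pi, z))  # shift by 3 periods
print(differs_by_period(z + 1j, z))            # shift by i, not a period
```

In floating-point arithmetic the identities hold only up to rounding, 
which is why a tolerance is used instead of exact equality. 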


c. Conformal Transformation. Inverse Functions 


By means of the functions u(x, y) and v(x, y) the points of the z- 
plane or x, y-plane are made to correspond to points of the $\zeta$-plane or 
u, v-plane. Thus, we have a transformation or mapping of regions of 
the x, y-plane onto regions of the u, v-plane determined by $\zeta = f(z) = u + iv$. 
By (8.5a, b), p. 780, the Jacobian of the transformation is 


$$ D = \frac{\partial(u, v)}{\partial(x, y)} = u_x v_y - u_y v_x = u_x^2 + v_x^2 = |f'(z)|^2. $$

The Jacobian is therefore different from zero and is, in fact, positive 
wherever $f'(z) \neq 0$. If we assume that $f'(z) \neq 0$, our previous results 
(p. 261) show that a neighborhood of the point $z_0$ in the z-plane, if 
sufficiently small, is mapped 1-1 and continuously on a region of the 
$\zeta$-plane in the neighborhood of the point $\zeta_0 = f(z_0)$. This mapping is 
conformal (i.e., angles are unchanged by it), for as we have seen in 
Chapter 3 (p. 288), the Cauchy-Riemann equations are the necessary 
and sufficient conditions for the transformation to be conformal and 
to preserve not only the magnitude but also the sign of angles. We 
thus have the following result: 


Conformality of the transformation given by u(x, y) and v(x, y) and 
analytic character of the function f(z) = u + iv mean exactly the same 
thing, provided we avoid points $z_0$ for which $f'(z_0) = 0$. 

The reader should study the examples of conformal representation 
discussed in Chapter 3 (pp. 243-244) and prove that all these trans- 
formations can be expressed by analytic functions of simple form. 
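
The identity $D = u_x v_y - u_y v_x = |f'(z)|^2$ can be illustrated 
numerically. The following sketch, in which the function name jacobian 
and the step d are assumptions made for this example, estimates the 
partial derivatives by central differences for $f(z) = z^2$, whose 
derivative is 2z. 

```python
def jacobian(f, x, y, d=1e-6):
    """Estimate u_x*v_y - u_y*v_x at (x, y) by central differences."""
    fx = (f(complex(x + d, y)) - f(complex(x - d, y))) / (2 * d)  # u_x + i v_x
    fy = (f(complex(x, y + d)) - f(complex(x, y - d))) / (2 * d)  # u_y + i v_y
    return fx.real * fy.imag - fy.real * fx.imag


def square(z):
    # f(z) = z^2 with f'(z) = 2z
    return z * z


x0, y0 = 1.2, -0.5
D = jacobian(square, x0, y0)
expected = abs(2 * complex(x0, y0)) ** 2   # |f'(z)|^2 = 4|z|^2
print(D, expected)
```

The Jacobian is positive here, in agreement with the statement that the 
mapping preserves the sign of angles wherever f'(z) does not vanish. 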

For a 1-1 conformal representation of a neighborhood of $z_0$ on a 
neighborhood of $\zeta_0$, the reverse transformation is also conformal. It 
follows that z = x + iy may also be regarded as an analytic function 
$\varphi(\zeta)$ of $\zeta = u + iv$. This function is called the inverse of $\zeta = f(z)$. 

Instead of using this geometrical argument, we can establish the 
analytic character of this inverse directly by calculating the deriva- 
tives of x(u, v), y(u, v) as in (24d) on p. 0000. We have 


$$ x_u = \frac{v_y}{D}, \qquad x_v = -\frac{u_y}{D}, \qquad y_u = -\frac{v_x}{D}, \qquad y_v = \frac{u_x}{D}, \tag{8.10} $$

and we see that the Cauchy-Riemann equations $x_u = y_v$, $x_v = -y_u$ 
are satisfied by the inverse function. As we can at once verify, the 
derivative of the inverse $z = \varphi(\zeta)$ of the function $\zeta = f(z)$ is given by 
the formula 


$$ \varphi'(\zeta) = \frac{dz}{d\zeta} = \frac{1}{f'(z)}. \tag{8.10b} $$
Exercises 8.2 


1. Prove that the product and the quotient of analytic functions and the 
function of an analytic function are again analytic, using not the prop- 
erty of differentiability but the Cauchy-Riemann differential equations. 

2. Show that if |f(z)| is constant in a region R, then f(z) is constant. 

3. Where are the following functions continuous? Which ones are differen- 
tiable? 


(a) z; œ) Izl; ©) fF (d) e 


4. Prove that in the transformation $\zeta = (z + 1/z)$ the circles with cen- 
ters at the origin and the straight lines through the origin of the z-plane 
are respectively transformed into confocal ellipses and hyperbolas in 
the $\zeta$-plane. 


5. For the general linear transformation 


$$ \zeta = \frac{az + b}{cz + d} \qquad (ad - bc \neq 0), $$


there may be as many as two fixed points, values of z for which $\zeta = z$. 
Show that if the transformation does have two fixed points, the family 
of circles through the two fixed points and the family of circles or- 
thogonal to them transform into themselves. (For this purpose the 
straight line through the points and the perpendicular bisector of the 
segment joining them are considered to be “circles” of the respective 
families.) 


6. Relate the inversion in the unit circle to the analytic function f(z) = 1/z 
and thus derive the basic properties of inversion stated in Section 3.3d, 
Exercise 4, p. 256. 


7. Prove that a substitution of the form 


$$ \zeta = \frac{\alpha z + \beta}{\bar{\beta} z + \bar{\alpha}}, $$

where $\alpha$ and $\beta$ are any complex numbers satisfying the relation 

$$ \alpha\bar{\alpha} - \beta\bar{\beta} = 1, $$


transforms the circumference of the unit circle into itself and the interior 
of the circle into itself. Prove also that if 


$$ \beta\bar{\beta} - \alpha\bar{\alpha} = 1, $$

the interior is transformed into the exterior. 


8. Prove that any circle may be transformed by a substitution of the form 
$\zeta = (\alpha z + \beta)/(\gamma z + \delta)$ into the upper half-plane bounded by the real 
axis. (Use Exercise 4, p. 778.) 


9. Prove that a substitution $\zeta = (\alpha z + \beta)/(\gamma z + \delta)$, where $\alpha\delta - \beta\gamma \neq 0$, 
leaves the cross ratio 


$$ \frac{(z_1 - z_3)/(z_2 - z_3)}{(z_1 - z_4)/(z_2 - z_4)} $$


of four points $z_1, z_2, z_3, z_4$ unaltered. 


8.3 The Integration of Analytic Functions 


a. Definition of the Integral 


The central theorem of the differential and integral calculus of 
functions of a real variable is that the indefinite integral of a function 
(the upper limit being undetermined) may be regarded as the primitive 
function or antiderivative of the original function (Volume I, p. 188). 
A corresponding relation forms the nucleus of the theory of analytic 
functions of a complex variable. 

We begin by extending the definition of the definite integral of a 
given function f(z). Here it is convenient to use t = r + is, instead of 
the independent variable z, to denote the variable of integration. Let 
the function f(t) be analytic in a region R, and let $t = t_0$ and t = z 
be two points in this region, joined by an oriented curve C that is 
sectionally smooth (see p. 88) and lies wholly within R (Fig. 8.2). 
We then subdivide the curve C into n portions by means of the succes- 
sive points $t_0, t_1, \ldots, t_n = z$ and form the sum 

$$ S_n = \sum_{\nu=1}^{n} f(t_\nu')(t_\nu - t_{\nu-1}), \tag{8.11a} $$


where $t_\nu'$ denotes any point lying on C between $t_{\nu-1}$ and $t_\nu$. If we now 
make the subdivision finer and finer by letting the number of points 
increase without limit in such a way that the greatest of the lengths 
$|t_\nu - t_{\nu-1}|$ tends to zero, $S_n$ tends to a limit that is independent of the 
choice of the particular intermediate points $t_\nu'$ and of the points $t_\nu$. 

Figure 8.2 

This can be proved directly by a method analogous to that used to 
prove the corresponding theorem of the existence of the definite inte- 
gral for real variables. For our purpose, however, it is more con- 
venient to reduce the theorem to what we already know about real 
curvilinear integrals (cf. Chapter 1, p. 89) as follows: We put f(t) = 
u(r, s) + iv(r, s), $t_\nu = r_\nu + is_\nu$, $t_\nu' = r_\nu' + is_\nu'$, 
$\Delta t_\nu = t_\nu - t_{\nu-1} = \Delta r_\nu + i\,\Delta s_\nu$. Then, we have 


$$ S_n = \sum_{\nu=1}^{n} \{u(r_\nu', s_\nu')\,\Delta r_\nu - v(r_\nu', s_\nu')\,\Delta s_\nu\} + i \sum_{\nu=1}^{n} \{v(r_\nu', s_\nu')\,\Delta r_\nu + u(r_\nu', s_\nu')\,\Delta s_\nu\}. $$

As n increases the sums on the right side tend to the real curvilinear 
integrals 


$$ \int_C (u\,dx - v\,dy) \quad \text{and} \quad \int_C (v\,dx + u\,dy), $$


respectively (where x and y are written for the real and imaginary 
parts r and s of t), and hence, as we asserted, $S_n$ tends to a limit. We call 
this limit the definite integral of the function f(t) along the curve C 
from $t_0$ to z and write it 


$$ \int_C f(t)\,dt \quad \text{or} \quad \int_{t_0}^{z} f(t)\,dt. $$

Thus, 

$$ \int_C f(t)\,dt = \int_C (u\,dx - v\,dy) + i \int_C (v\,dx + u\,dy). \tag{8.11b} $$


The definition of this definite integral at once gives an important es- 
timate: If $|f(t)| \le M$ on the path of integration, where M is a constant 
and L is the length of the path of integration, then 


$$ \left| \int_C f(t)\,dt \right| \le ML, \tag{8.11c} $$


for by (8.11a) and Volume I (p. 350), 

$$ |S_n| \le M \sum_{\nu=1}^{n} |t_\nu - t_{\nu-1}| \le ML. $$


In addition, we point out that operations with complex integrals 
(in particular, combinations of different paths of integration) satisfy 
all the rules stated in this connection for curvilinear integrals in 
Chapter 1 (pp. 93-95). 
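
The sum (8.11a) and the estimate (8.11c) can be tried out directly. The 
sketch below is one possible realization, assuming the curve is given 
by a parametrization on [0, 1] and taking the intermediate point at the 
midpoint of each parameter interval; the names riemann_sum, segment, 
and parabola are hypothetical. It integrates $f(t) = t^2$ from 0 to 
1 + i along two different paths and compares with $(1 + i)^3/3$. 

```python
import math


def riemann_sum(f, path, n):
    """S_n = sum of f(t_v') * (t_v - t_{v-1}) over a subdivision of the path.

    path maps a parameter in [0, 1] to a point of the curve; t_v' is the
    point at the midpoint of each parameter interval.
    """
    total = 0j
    for v in range(1, n + 1):
        t_prev = path((v - 1) / n)
        t_next = path(v / n)
        t_mid = path((v - 0.5) / n)
        total += f(t_mid) * (t_next - t_prev)
    return total


def square(t):
    return t * t


def segment(s):
    # straight line from 0 to 1 + i
    return s * complex(1.0, 1.0)


def parabola(s):
    # parabolic arc from 0 to 1 + i
    return complex(s, s * s)


exact = complex(1.0, 1.0) ** 3 / 3
on_segment = riemann_sum(square, segment, 2000)
on_parabola = riemann_sum(square, parabola, 2000)
print(on_segment, on_parabola, exact)

# Estimate (8.11c) on the segment: |f| <= M = 2 and L = sqrt(2)
M, L = 2.0, math.sqrt(2.0)
print(abs(on_segment) <= M * L)
```

That both paths give nearly the same value anticipates the path 
independence stated in Cauchy’s theorem below. 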


b. Cauchy’s Theorem 


The most important property of functions of a complex variable is 
that the integral between $t_0$ and z is largely independent of the choice 
of the path of integration C. In fact, we have Cauchy’s theorem: 


If the function f(t) is analytic in a simply connected region R, the 
integral 


$$ \int_C f(t)\,dt = \int_{t_0}^{z} f(t)\,dt $$

is independent of the particular choice of the path of integration C join- 
ing $t_0$ and z in R; the integral is an analytic function F(z) such that 


$$ \frac{d}{dz} F(z) = \frac{d}{dz}\left[\int_{t_0}^{z} f(t)\,dt\right] = f(z). $$


F(z) is accordingly a primitive function or indefinite integral of f(z). 
Cauchy’s theorem may also be expressed as follows: 


The integral of f(t) around a closed curve lying in a simply connected 
region in which f is analytic, has the value zero. 

The proof that the integral is independent of the path follows im- 
mediately from (8.11b) and the main theorem on curvilinear integrals 
(cf. Chapter 1, p. 104); for both u dx - v dy, the integrand in the real 
part, and v dx + u dy, the integrand in the imaginary part, satisfy 
the condition of integrability, by virtue of the Cauchy-Riemann equa- 
tions (8.5a). Thus the integral is a function of x, y or of x + iy = z, 
F(z) = U(x, y) + iV(x, y), and from our previous results for curvilinear 
integrals, we have the relations 


$$ U_x = u, \qquad U_y = -v, \qquad V_x = v, \qquad V_y = u, $$

that is [see (8.5b), p. 780], 

$$ U_x = V_y, \qquad U_y = -V_x, \qquad U_x + iV_x = u + iv, $$


which shows that F(z) is actually an analytic function in R with the 
derivative F'(z) = f(z). 

The assumption that the region is simply connected is essential 
for the validity of Cauchy’s theorem. For example, consider the func- 
tion 1/t, which is analytic everywhere in the t-plane except at the ori- 
gin. We are not entitled to conclude from Cauchy’s theorem that the 
integral of 1/t, taken around a closed curve enclosing the origin, 
vanishes, for such a curve cannot be enclosed in a simply connected 
region in which the function is analytic. The simple connectivity 
of the region is destroyed by the exceptional point t = 0. If, for ex- 
ample, we take the integral around a circle K given by |t| = r or 
$t = re^{i\theta}$ in the positive sense and make $\theta$ the variable of integration 
($dt = rie^{i\theta}\,d\theta$), we have 


$$ \int_K \frac{dt}{t} = \int_0^{2\pi} \frac{rie^{i\theta}}{re^{i\theta}}\,d\theta = 2\pi i; \tag{8.12a} $$


that is, the value of the integral is not zero but $2\pi i$. 
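
The value (8.12a) and its independence of the radius can be reproduced 
numerically. The sketch below approximates the integral over a circle 
by a sum over equally spaced values of $\theta$; the helper name 
circle_integral and the number of sample points are assumptions made 
for this illustration. 

```python
import cmath
import math


def circle_integral(f, center, radius, n=2000):
    """Approximate the integral of f over |t - center| = radius (positive sense)."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        w = cmath.exp(1j * theta)
        t = center + radius * w
        dt = 1j * radius * w * (2 * math.pi / n)   # dt = r i e^{i theta} d theta
        total += f(t) * dt
    return total


def reciprocal(t):
    return 1 / t


around_origin = circle_integral(reciprocal, 0, 1.0)     # encloses t = 0
larger_circle = circle_integral(reciprocal, 0, 5.0)     # same value, other radius
origin_outside = circle_integral(reciprocal, 3.0, 1.0)  # t = 0 lies outside
print(around_origin, larger_circle, origin_outside)
```

When the circle does not enclose the origin, the function 1/t is 
analytic in a simply connected region containing the circle, and the 
computed integral is zero up to rounding, as Cauchy’s theorem requires. 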
We can, however, extend Cauchy’s theorem to multiply connected 
regions as follows: 


If a multiply connected region R is bounded by a finite number of 
sectionally smooth closed curves $C_1, C_2, \ldots$ and if f(z) is analytic in the 
interior of this region and on its boundary, then the sum of the integrals 
of the function along all the boundary curves is zero, provided that all 
the boundaries are described in the same sense relative to the interior 
of the region R, that is, that the region R is always on the same side, 
say the left-hand side, of the curve as it is described. 


The proof follows at once, on the model of the corresponding proofs 
for curvilinear integrals: We cut up the region R into a finite number 
of simply connected regions (Figs. 8.3 and 8.4), apply Cauchy’s theorem 
to these regions separately, and add the results. 

Figure 8.3 $\int_C = \int_{C_1} + \int_{C_2}$. 

Figure 8.4 A multiply connected region R subdivided by segments 
$Q_1, Q_2, \ldots$ into simply connected regions. 

We can express this theorem in a somewhat different way: 


If the region R is formed from the interior of a closed curve C by 
cutting out of this interior the interiors of further curves $C_1, C_2, \ldots$, 
then 

$$ \int_C f(t)\,dt = \sum_{\nu} \int_{C_\nu} f(t)\,dt, \tag{8.12b} $$


where the integrals around the external boundary C and the internal 
boundaries are to be taken in the same sense. 


¹A function is said to be analytic on a curve if it is analytic throughout a neighbor- 
hood, no matter how small, of this curve. 


c. Applications. The Logarithm, the Exponential Function, and the 
General Power Function 


We can now use Cauchy’s theorem as the basis for a satisfactory 
theory of the logarithm, the exponential function, and hence the other 
elementary functions, following a procedure similar to that adopted 
for a real variable (Volume I, Chapter 2, p. 145). 

We begin by defining the logarithm as the integral of the function 
1/t. At first, we limit the path of integration by making it lie in a 
simply connected region of analyticity by making a cut along the 
negative real axis, that is, by permitting no path of integration to 
cross the negative real axis. More precisely, if we put 
$t = |t|(\cos\theta + i \sin\theta)$, we limit $\theta$ by the inequality $-\pi < \theta < \pi$. In the t-plane, after 
the cut has been made, we join the point t = 1 to an arbitrary point z 
by any curve C, and we can then use Cauchy’s theorem to integrate 
the function 1/t between these two points, independently of the path. 
The result is an analytic function that we call log z and that is defined 
uniquely for $z \neq 0$: 


$$ \zeta = \log z = \int_1^{z} \frac{dt}{t} = f(z). \tag{8.12c} $$

The logarithm has the property that 

$$ \frac{d}{dz}(\log z) = \frac{1}{z}. \tag{8.12d} $$


The inverse of the logarithm can be identified with the exponential 
function. We consider the function $e^{\log z}$ defined for $z \neq 0$ in the plane 
slit along the negative real axis, in accordance with the definition of 
the logarithm. Using the chain rule of differentiation, we find from 
(8.12d) and (8.6) for $z \neq 0$: 


$$ \frac{d}{dz}\left(\frac{1}{z}\,e^{\log z}\right) = -\frac{1}{z^2}\,e^{\log z} + \frac{1}{z}\cdot\frac{1}{z}\,e^{\log z} = 0. $$

Hence, 

$$ \frac{1}{z}\,e^{\log z} = \text{constant} = c. $$


If we take here z = 1, we find that 


$$ c = e^{\log 1} = e^{0} = 1. $$

Thus, 

$$ e^{\log z} = z \quad \text{for all } z \neq 0. \tag{8.13a} $$

Equation (8.13a) shows that the equation 

$$ e^{w} = z \tag{8.13b} $$

has at least one solution w for every $z \neq 0$, namely, 

$$ w = \log z. \tag{8.13c} $$


Hence, the exponential function assumes all complex values but zero. 

The solution, however, is not unique. We know from p. 785 that if 
w is any particular solution of (8.13b), then the general solution has 
the form 


$$ w + 2n\pi i, $$


where n is an integer. Hence: 

For any $z \neq 0$ the equation 

$$ e^{w} = z \tag{8.13d} $$

is equivalent to 

$$ w = \log z + 2n\pi i, \tag{8.13e} $$


where n is an integer. 

¹One is tempted to conclude similarly from 

$$ \frac{d}{dz}\,g(z) = \frac{d}{dz}\left[\log(e^{z}) - z\right] = \frac{1}{e^{z}}\,e^{z} - 1 = 0 $$

that $g(z) = \log(e^{z}) - z$ = constant. But this is wrong, since g(0) = 0 and 
$g(2\pi i) = -2\pi i$. It is left to the reader to discover the fallacy of the argument. 

As an application we derive the addition theorem for logarithms. 
We have for any complex z, $\zeta$ that do not vanish, from (8.13a) 


$$ z\zeta = e^{\log z}\,e^{\log \zeta} = e^{\log z + \log \zeta} $$


and, on the other hand, 


$$ z\zeta = e^{\log(z\zeta)}. $$

Hence, 

$$ \log(z\zeta) = \log z + \log \zeta + 2n\pi i, \tag{8.14} $$


where n is an integer. Here, for positive real z, $\zeta$ we can always take 
n = 0 but not for others, as the following example shows. 
The integral 


$$ \log z = \int_1^{z} \frac{dt}{t} $$


is easily evaluated explicitly by taking the straight line joining the 
points t = 1 and t = |z| together with the circular arc |t| = |z| as the 
path of integration. Setting $t = |z|e^{i\varphi}$ on the circle, we have 


$$ \log z = \int_1^{|z|} \frac{dt}{t} + i \int_0^{\theta} d\varphi = \log |z| + i\theta, \tag{8.15} $$


where $\theta$ is the argument of the complex number z (Fig. 8.5). For ex- 
ample, 


$$ \log 1 = 0, \qquad \log i = \frac{\pi i}{2}, \qquad \log(-1) = \pi i. $$


Figure 8.5 $\log z = \log |z| + i\theta$. 


We notice that 

$$ \log[(-1)(-1)] = \log 1 = 0 = \log(-1) + \log(-1) - 2\pi i. $$


Thus, in formula (8.14), we cannot take n = 0 when $z = \zeta = -1$. 
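
Formula (8.15) and the failure of n = 0 in (8.14) can be checked with a 
few lines of code. In the sketch below, principal_log is a hypothetical 
helper that builds log |z| + i$\theta$ from the modulus and the argument; 
it relies on cmath.phase, which returns the argument in the interval 
from $-\pi$ to $\pi$ and assigns the value $\pi$ to negative real 
numbers given without a signed zero. 

```python
import cmath
import math


def principal_log(z):
    """Return log|z| + i*theta, with theta = arg z taken from cmath.phase."""
    return complex(math.log(abs(z)), cmath.phase(z))


log_i = principal_log(1j)            # expected: pi*i/2
log_minus_one = principal_log(-1)    # expected: pi*i

# Addition theorem (8.14) for z = zeta = -1: find the integer n
lhs = principal_log((-1) * (-1))
rhs = principal_log(-1) + principal_log(-1)
n = round(((lhs - rhs) / (2j * math.pi)).real)
print(log_i, log_minus_one, n)

# (8.13a): exp(log z) = z for a sample point off the negative real axis
sample = complex(-2.0, 3.0)
roundtrip_gap = abs(cmath.exp(principal_log(sample)) - sample)
```

The computed integer is n = -1, in agreement with the relation 
$\log 1 = \log(-1) + \log(-1) - 2\pi i$ noted above. 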
The value obtained in this way for the logarithm of any complex 
number z, whose argument lies in the interval $-\pi < \theta < \pi$, is often 
called the principal value of the logarithm. This terminology is 
justified by the fact that other values of the logarithm can be obtained 
by removing the condition that the negative real axis must not be 
crossed. We can then join the point 1 to the point z by a path that en- 
closes the origin t = 0. On this curve, the argument of t will increase 
up to a value that is greater or less than the argument previously as- 
signed to z by $2\pi$. We then have the value 


$$ \log z = \log |z| + i\theta + 2\pi i $$


for the integral (Fig. 8.6). In the same way, by making the curve travel 
around the origin in one direction or the other any integral number of 
times n, we obtain the value 


$$ \log z = \log |z| + i\theta + 2n\pi i. \tag{8.16} $$


This expresses the many-valuedness of the logarithm. Formula (8.16) 
represents the general solution of the equation $e^{\log z} = z$. 


Figure 8.6 $\log z = \log |z| + i\theta + 2\pi i$. 


¹Of course, the many-valued logarithm is not a function in the sense of a univalent 
assignment of a complex logarithm to each number z; the principal value is a func- 
tion in that sense. 

Now that we have introduced the logarithm and the exponential 
function it is easy to define the general power functions $a^z$ and $z^\alpha$, 
where a and $\alpha$ are complex constants (cf. the corresponding discus- 
sion for the real variable in Volume I, p. 152). We define $a^z$ by the re- 
lation 


$$ a^{z} = e^{z \log a} \qquad (a \neq 0), \tag{8.16a} $$


where the principal value of log a is to be taken. In the same way we 
define $z^\alpha$ by the relation 


$$ z^{\alpha} = e^{\alpha \log z} \qquad (z \neq 0). \tag{8.16b} $$


While the function $a^z$ is defined uniquely if we use the principal 
value of log a in the definition, the many-valuedness of the function 
$z^\alpha$ goes deeper. Taking the many-valuedness of log z into account, we 
see that along with any one value of $z^\alpha$ we also have all the other 
values obtained by multiplying one value by $e^{2n\pi i\alpha}$, where n is any 
positive or negative integer. If $\alpha$ is rational, say $\alpha = p/q$, where p and q 
are integers prime to one another, among these multipliers there are 
only a finite number of different values (whose qth power must be 
unity). If, however, $\alpha$ is irrational, we obtain an infinite number of dif- 
ferent multipliers. The many-valuedness of the function $z^\alpha$ will be 
discussed in greater detail on p. 815. 
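
The contrast between rational and irrational exponents can be made 
concrete. The following sketch lists the values 
$\exp[\alpha(\log z + 2n\pi i)]$ for a finite range of integers n and 
counts how many are distinct within a tolerance; the helper names 
power_values and count_distinct, the range of n, and the tolerance are 
assumptions made for this example. 

```python
import cmath
import math


def power_values(z, alpha, n_range):
    """Values exp(alpha * (log z + 2*n*pi*i)) for n in n_range (principal log z)."""
    log_z = cmath.log(z)
    return [cmath.exp(alpha * (log_z + 2j * math.pi * n)) for n in n_range]


def count_distinct(values, tol=1e-9):
    """Count values that differ pairwise by more than tol."""
    distinct = []
    for w in values:
        if all(abs(w - d) > tol for d in distinct):
            distinct.append(w)
    return len(distinct)


n_range = range(-6, 7)                                   # 13 integers n
cube_roots = power_values(8, 1 / 3, n_range)             # alpha = 1/3
irrational = power_values(8, math.sqrt(2.0), n_range)    # alpha = sqrt(2)

print(count_distinct(cube_roots))   # only q = 3 different values
print(count_distinct(irrational))   # one value for each n in the range
```

For $\alpha = 1/3$ every value is a cube root of 8, whereas for 
$\alpha = \sqrt{2}$ each of the 13 integers n in the chosen range yields 
a different value. 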

As we see from the chain rule, these functions satisfy the dif- 
ferentiation formulae 


$$ \frac{d}{dz}\,a^{z} = a^{z} \log a, \qquad \frac{d}{dz}\,z^{\alpha} = \alpha z^{\alpha - 1}. \tag{8.16c} $$

Exercises 8.3 
1. Consider Í = — 2 dz. 


(a) What are the values of this integral taken counterclockwise around 
small circles centered at 1 and at —1? 
(b) Describe a closed path surrounding both 1 and —1 about which the 
integral is zero. 
2. Investigate the extensions of the laws of exponents, 


$$ a^{s} a^{t} = a^{s+t}, \qquad s^{a} t^{a} = (st)^{a}, \qquad (a^{s})^{t} = a^{st} = (a^{t})^{s}, $$

from the real to the complex domain and discuss the complications that 
arise from many-valuedness in the definition $z^{\alpha} = \exp[\alpha(\log z + 2n\pi i)]$, 
where log z is the principal value of the logarithm. 
3. (a) Show that all values of $i^{i}$ are real. 
(b) Find general conditions on complex z ($z \neq 0$) and $\zeta$ such that all 
values of $z^{\zeta}$ are real. 
(c) Is it possible to choose real x and $\xi$, such that all the values of $x^{\xi}$ are 
real? 
4. The gamma function: Prove that the integral 


$$ \Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt, $$

where the principal value of $t^{z-1}$ is taken, extended over all positive real values of 
the variable of integration t, is an analytic function of the parameter z = 
x + iy if x > 0. Show directly that the expression $\Gamma(z)$ can be differen- 
tiated with respect to z. Prove that the gamma function thus defined for 
the complex variable satisfies the functional equation $\Gamma(z + 1) = z\Gamma(z)$. 

5. Riemann’s zeta function: Taking the principal value of $n^{z}$, form the in- 
finite series 


$$ \zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^{z}} \qquad (z = x + iy). $$


Prove that this series converges if x > 1 and represents a differentiable 
function [$\zeta(z)$ is called Riemann’s zeta function]. The proof can be carried 
out directly by a method like that for power series (cf. Volume I, p. 525). 


6. (a) Apply Cauchy’s theorem to the integral 

$$ \int \left(z + \frac{1}{z}\right)^{m} z^{n-1}\,dz \qquad (n > m > 0) $$


taken along a path consisting of the positive quadrant of the unit 
circle |z| = 1 and the parts of the axes between the origin and the 
circle, a small circular detour being made round z = 0; and deduce 
that 


$$ \int_0^{\pi/2} \cos^{m}\theta \,\cos n\theta \,d\theta = \frac{\sin[(n - m)\pi/2]}{2^{m+1}}\,\frac{\Gamma(m + 1)\,\Gamma[(n - m)/2]}{\Gamma[(n + m)/2 + 1]}. $$

(b) Prove that if n = m the value of the latter integral is $\pi/2^{m+1}$. (In the 
complex integral the integrand may be taken as real on the positive 
half of the axis.) 


8.4 Cauchy’s Formula and Its Applications 


a. Cauchy’s Formula 


Cauchy’s theorem for multiply connected regions leads to a 
fundamental formula, again Cauchy’s, which expresses the value 
of an analytic function f(z) at any point z = a in the interior of a 
closed region R throughout which the function is analytic, by means 
of the values that the function takes on the boundary C. 

We assume that the function f(z) is analytic in the simply con- 
nected region R and on its boundary C. Then the function 


$$ g(z) = \frac{f(z)}{z - a} $$


is analytic everywhere in the region R, the boundary C included, 
except at the point z = a. Out of the region R we cut a circle of small 
radius $\rho$ about the point z = a, lying entirely within R (Fig. 8.7), and 
then apply Cauchy’s theorem (p. 790) to the function g(z). If K denotes 
the circumference of the circle described in the positive sense and C 
the boundary of R described in the positive sense, Cauchy’s theorem 
states that [see (8.12b), p. 791] 


$$ \int_C g(z)\,dz = \int_K g(z)\,dz. $$

Figure 8.7 

On the circle K we have $z - a = \rho e^{i\theta}$, where the angle $\theta$ determines 
the position of the point on the circumference. On the circle, there- 
fore, $dz = \rho i e^{i\theta}\,d\theta$, and hence, 


$$ \int_K g(z)\,dz = i \int_0^{2\pi} f(a + \rho e^{i\theta})\,d\theta. $$


Since f(z) is continuous at the point a, we have, provided $\rho$ is sufficient- 
ly small, 


$$ f(a + \rho e^{i\theta}) = f(a) + \eta, $$


where $|\eta|$ is less than an arbitrary prescribed positive quantity $\varepsilon$. 
Hence, 


$$ \left| \int_0^{2\pi} f(a + \rho e^{i\theta})\,d\theta - \int_0^{2\pi} f(a)\,d\theta \right| = \left| \int_0^{2\pi} \eta\,d\theta \right| < 2\pi\varepsilon, $$

and therefore, 

$$ \int_0^{2\pi} f(a + \rho e^{i\theta})\,d\theta = 2\pi f(a) + \kappa, $$

where $|\kappa| < 2\pi\varepsilon$. Thus, if $\rho$ is sufficiently small, 


$$ \int_C g(z)\,dz = 2\pi i f(a) + \kappa i, $$


where $|\kappa i| < 2\pi\varepsilon$. 

If we make $\varepsilon$ tend to zero (by making $\rho$ tend to zero), the right 
side of the equation tends to $2\pi i f(a)$, while the value of the left side, 
namely, 


$$ \int_C g(z)\,dz, $$


is unaltered. We thus obtain Cauchy’s fundamental integral formula 


$$ f(a) = \frac{1}{2\pi i} \int_C \frac{f(z)}{z - a}\,dz. \tag{8.17a} $$


If we now revert to the use of t as variable of integration and then 
replace a by z, the formula takes the form 


$$ f(z) = \frac{1}{2\pi i} \int_C \frac{f(t)}{t - z}\,dt. \tag{8.17b} $$


This formula expresses the values of a function in the interior of 
a closed region in which the function is analytic by means of the 
values that the function takes on the boundary of the region. 

In particular, if C is a circle $t = z + re^{i\theta}$ with center z, that is, if 
$dt = ire^{i\theta}\,d\theta$, then 


$$ f(z) = \frac{1}{2\pi} \int_0^{2\pi} f(z + re^{i\theta})\,d\theta. $$


In words, the value of a function at the center of a circular disk is equal 
to the mean of its values on the circumference, provided that the circle 
and its interior are contained in a region where the function is analytic. 
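
Cauchy's integral formula and the mean-value property can be checked numerically. The following is only a minimal sketch under stated assumptions: the sample function (the exponential), the test point, the radii, and the helper names `cauchy_integral` and `circle_mean` are illustrative choices and do not come from the text. The contour integral is replaced by a sum over equally spaced points of the circle.

```python
import cmath
import math


def cauchy_integral(f, z, center=0j, radius=1.0, n=2000):
    """Approximate (1 / (2*pi*i)) * integral over C of f(t) / (t - z) dt.

    C is the circle t = center + radius * exp(i*theta), 0 <= theta < 2*pi,
    sampled at n equally spaced angles (a Riemann sum in theta).
    """
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * k * dtheta)
        t = center + radius * e
        dt = 1j * radius * e * dtheta      # dt = i r e^{i theta} d(theta)
        total += f(t) / (t - z) * dt
    return total / (2j * math.pi)


def circle_mean(f, z, r, n=2000):
    """Mean value of f on the circle of radius r about z."""
    dtheta = 2 * math.pi / n
    return sum(f(z + r * cmath.exp(1j * k * dtheta)) for k in range(n)) / n


f = cmath.exp                 # sample analytic function (illustrative choice)
z = 0.3 + 0.2j                # a point inside the unit circle
print(abs(cauchy_integral(f, z) - f(z)))       # expected to be tiny
print(abs(circle_mean(f, z, 0.5) - f(z)))      # expected to be tiny
```

Both differences should be at the level of rounding error, since the sampled sum converges very quickly for a function analytic on and inside the circle.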


b. Expansion of Analytic Functions in Power Series 


Cauchy’s formula has a number of important consequences. The 
chief of these is that every analytic function can be expanded in a power 
series, which connects the present theory with that in Section 8.1 
(p. 772). More precisely, we have the following theorem: 


If the function f(z) is analytic in the interior and on the boundary of
a circle $|z - z_0| \le R$, it can be expanded as a power series in $z - z_0$
that converges in the interior of that circle.

In proving this we can take $z_0 = 0$ without loss of generality.
(Otherwise we could merely introduce a new independent variable z′
by means of the transformation $z - z_0 = z'$.) We now apply Cauchy’s
integral formula (8.17b) to the circle C, |t| = R, and write the integrand
(using the geometric series) in the form

\[ \frac{f(t)}{t - z} = \frac{f(t)}{t}\,\frac{1}{1 - z/t} = \frac{f(t)}{t}\left[1 + \frac{z}{t} + \frac{z^2}{t^2} + \cdots + \frac{z^n}{t^n}\right] + \frac{f(t)}{t}\,\frac{(z/t)^{n+1}}{1 - z/t}. \]

Since z is a point in the interior of the circle, |z/t| = q is a positive
number less than unity, and we estimate the remainder of the geometric
series,

\[ r_n = \frac{(z/t)^{n+1}}{1 - z/t}, \]

by

\[ |r_n| \le \frac{q^{n+1}}{1 - q}. \]

Introducing our expressions into Cauchy’s formula and integrating
term by term, we obtain

\[ f(z) = c_0 + c_1 z + \cdots + c_n z^n + R_n, \]

where

\[ c_\nu = \frac{1}{2\pi i} \int_C \frac{f(t)}{t^{\nu+1}}\,dt, \qquad R_n = \frac{1}{2\pi i} \int_C \frac{f(t)}{t}\,\frac{(z/t)^{n+1}}{1 - z/t}\,dt. \]

If M is an upper bound of the values of |f(t)| on the circumference of
the circle, our estimate (8.11c) for complex integrals immediately
gives

\[ |R_n| \le \frac{1}{2\pi} \cdot 2\pi R \cdot \frac{M}{R} \cdot \frac{q^{n+1}}{1 - q} = M\,\frac{q^{n+1}}{1 - q} \]

for the remainder. Since q < 1, this remainder tends to zero as n
increases and we obtain the power series expansion for f(z),

\[ f(z) = \sum_{\nu=0}^{\infty} c_\nu z^\nu, \]

where

\[ c_\nu = \frac{1}{2\pi i} \int_C \frac{f(t)}{t^{\nu+1}}\,dt. \tag{8.18a} \]


Our assertion is thus proved. 
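
The coefficient formula (8.18a) lends itself to a direct numerical check. The sketch below is illustrative only: it assumes the sample function exp(z), whose Taylor coefficients at the origin are 1/ν!, and the helper name `taylor_coefficient` is hypothetical. The integral over the circle |t| = R is approximated by a sum over equally spaced points.

```python
import cmath
import math


def taylor_coefficient(f, nu, radius=1.0, n=512):
    """Approximate c_nu = (1 / (2*pi*i)) * integral over |t| = radius of f(t) / t**(nu+1) dt."""
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        t = radius * cmath.exp(1j * k * dtheta)
        dt = 1j * t * dtheta               # dt = i t d(theta) on the circle
        total += f(t) / t ** (nu + 1) * dt
    return total / (2j * math.pi)


for nu in range(6):
    c = taylor_coefficient(cmath.exp, nu)
    print(nu, c.real, 1 / math.factorial(nu))   # the two columns should agree closely
```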

This theorem has important consequences. To begin with, we 
know from p. 776 that every power series can be differentiated as 
often as we please in the interior of its circle of convergence. Since 
every analytic function can be represented by a power series, it 
follows that the derivative of a function in the interior of a region where 
the function is analytic is also differentiable (i.e., is again an analytic 
function). In other words, the operation of differentiation does not lead 
us out of the class of analytic functions. As we already know that the 
same is true for the operation of integration, we see that differenti- 
ation and integration of analytic functions can be carried out without 
any restrictions. This is an agreeable state of affairs, which does not 
exist in the case of real functions. 

Since, as we saw in Section 8.1 (p. 776), every power series is the 
Taylor series of the function that it represents, it now follows in 
general that every analytic function can be expanded in the neighbor- 
hood of a point z = zo in a region R where the function is analytic in 
a Taylor series 


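\[ f(z) = f(z_0) + \sum_{\nu=1}^{\infty} \frac{f^{(\nu)}(z_0)}{\nu!} (z - z_0)^\nu. \tag{8.18b} \]

The coefficients $c_\nu$ in (8.18a) are accordingly given by the formulae

\[ c_\nu = \frac{f^{(\nu)}(z_0)}{\nu!} = \frac{1}{2\pi i} \int_C \frac{f(z_0 + t)}{t^{\nu+1}}\,dt. \tag{8.18c} \]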


From this result we may also deduce an important fact about the 
radius of convergence of a power series. The Taylor series of a function 
f(z) in the neighborhood of a point z = zo converges in the interior of 
the largest circle whose interior lies wholly within the region where 
the function is defined and is analytic. 

By virtue of the theorems on differentiation and integration that 
we have now established as also valid for the complex variable, all 
the elementary functions of a real variable that we expanded in 
Taylor series have exactly the same Taylor series for a complex in- 
dependent variable. For most of these functions we have already seen 
that this is true. 

Here we may point out that, for example, the binomial series (cf.
Volume I, p. 456)

\[ (1 + z)^\alpha = \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} z^\nu \tag{8.19a} \]

is also valid for the complex variable if |z| < 1 and α is any complex
exponent, provided that

\[ (1 + z)^\alpha = e^{\alpha \log(1+z)} \tag{8.19b} \]

is formed from the principal value of log (1 + z).

The fact that the radius of convergence of this series is equal to 
unity follows from what we have just said, together with the remark 
that the function $(1 + z)^\alpha$ is no longer analytic at the point z = −1,
for if it were, all the derivatives would exist there, which is certainly 
not the case. The circle with radius 1 with the point z = 0 as center 
is therefore the largest circle in the interior of which the function is 
still analytic. 

This example illustrates that the convergence behavior of power 
series, which real analysis leaves in mystery, becomes completely 
intelligible in the light of the fact that we have just proved about the
radius of convergence. 

For example, the failure of the geometric series representing
$1/(1 + z^2)$ to converge on the unit circle is a simple consequence of the
fact that the function is no longer analytic for z = i and z = −i.
We also see now that the power series

\[ \frac{z}{e^z - 1} = \sum_{\nu=0}^{\infty} \frac{B_\nu}{\nu!} z^\nu, \tag{8.20} \]

which defines Bernoulli’s numbers (cf. Volume I, p. 562), must have
the circle |z| = 2π as its circle of convergence, for the denominator of
the function vanishes for z = 2πi but (apart from the origin) at no
point interior to the circle |z| < 2π.
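
The radius 2π can be made plausible from the coefficients themselves. The sketch below rests on assumptions that are not spelled out in the surrounding text: the normalization z/(e^z − 1) = Σ B_ν z^ν/ν! for the Bernoulli numbers, and the standard recurrence used to compute them. The function names are hypothetical. Since the odd-indexed coefficients beyond the first vanish, the ratio test is applied to consecutive even-indexed coefficients.

```python
from fractions import Fraction
from math import comb, factorial, pi, sqrt


def bernoulli_numbers(m):
    """B_0, ..., B_m from B_0 = 1 and sum_{k=0}^{n} C(n+1, k) * B_k = 0 for n >= 1."""
    B = [Fraction(1)]
    for n in range(1, m + 1):
        s = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-s / (n + 1))
    return B


def radius_estimate(n):
    """Ratio-test estimate sqrt(|c_{2n} / c_{2n+2}|), where c_nu = B_nu / nu!."""
    B = bernoulli_numbers(2 * n + 2)
    c_low = B[2 * n] / factorial(2 * n)
    c_high = B[2 * n + 2] / factorial(2 * n + 2)
    return sqrt(abs(c_low / c_high))


for n in (2, 5, 10):
    print(n, radius_estimate(n), 2 * pi)   # the estimates approach 2*pi
```

Exact rational arithmetic is used for the Bernoulli numbers so that the only rounding occurs in the final square root.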


c. The Theory of Functions and Potential Theory 


Since analytic functions f = u + iv may be differentiated as often
as we please, it follows that the functions u(x, y) and v(x, y) also have
continuous derivatives of any order. We may, therefore, differentiate
the Cauchy-Riemann equations. If we differentiate the first equation
with respect to x and the second with respect to y and add, we have

\[ \Delta u = u_{xx} + u_{yy} = 0; \]

in the same way, the imaginary part v satisfies the same equation

\[ \Delta v = v_{xx} + v_{yy} = 0. \]


In other words, the real part and the imaginary part of an analytic 
function are potential functions. 
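
That the real and imaginary parts are potential functions can be illustrated with a finite-difference Laplacian. This is a rough sketch only: the sample function z³ + e^z, the evaluation point, the step size, and all helper names are assumptions made for the illustration. A function that is not the real part of an analytic function is included for contrast.

```python
import cmath


def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference approximation of g_xx + g_yy at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h) - 4 * g(x, y)) / h ** 2


def f(z):
    """Sample analytic function (illustrative choice)."""
    return z ** 3 + cmath.exp(z)


def u(x, y):
    """Real part of f."""
    return f(complex(x, y)).real


def v(x, y):
    """Imaginary part of f."""
    return f(complex(x, y)).imag


def not_harmonic(x, y):
    """|z|^2 = x^2 + y^2, whose Laplacian is 4."""
    return x * x + y * y


print(laplacian(u, 0.4, -0.7))             # close to 0
print(laplacian(v, 0.4, -0.7))             # close to 0
print(laplacian(not_harmonic, 0.4, -0.7))  # close to 4
```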

If two potential functions u, v satisfy the Cauchy-Riemann equa- 
tions, v is said to be conjugate to u, and —u conjugate to v. 

This suggests that the theory of functions of a complex variable 
and potential theory in two dimensions are essentially equivalent to 
one another. 


d. The Converse of Cauchy’s Theorem 
Cauchy’s theorem has a valid converse (Morera’s theorem): 


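If the integral of the continuous function ζ = u + iv = f(z) around
every closed curve C in its region of definition R vanishes, then f(z)
is an analytic function in R.

To prove this, we note that the integral

\[ F(z) = \int_{t_0}^{z} f(t)\,dt \]

taken along any path joining a fixed point $t_0$ and a variable point
z is independent of the path. Then by (8.11c), p. 789,

\[ \left| \frac{F(z + h) - F(z)}{h} - f(z) \right| = \left| \frac{1}{h} \int_z^{z+h} [f(t) - f(z)]\,dt \right| \le \max |f(t) - f(z)| \to 0 \quad \text{as } h \to 0, \]

where the integral is taken along the straight segment from z to z + h
and the maximum is taken over that segment.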


Hence, F(z) has the derivative F’(z) = f(z). F(z) is therefore analytic, 
and by our earlier result, so is its derivative f(z). 

The converse of Cauchy’s theorem shows that the postulate of 
differentiability could have been replaced by the postulate of inte- 
grability (i.e., that the line integral is independent of the path). The 
equivalence of these two postulates is a very characteristic feature 
of the theory of functions of a complex variable. 


e. Zeros, Poles, and Residues of an Analytic Function 


If the function f(z) vanishes at the point $z = z_0$, the constant term
in the Taylor series of the function in powers of $z - z_0$,

\[ f(z) = f(z_0) + (z - z_0) f'(z_0) + \cdots, \]

vanishes, and possibly other terms of the series also vanish. A factor
$(z - z_0)^n$ may then be taken out of the power series and we may write

\[ f(z) = (z - z_0)^n g(z), \]

where $g(z_0) \ne 0$. A point $z_0$ for which this occurs is said to be a zero
of the function f(z) of the nth order.

The reciprocal 1/f(z) = q(z) of an analytic function, as we saw
above, is also analytic, except at the points where f(z) vanishes. If $z_0$
is a zero of f(z) of the nth order, the function q(z) can be represented
in the neighborhood of the point $z_0$ in the form

\[ q(z) = \frac{1}{(z - z_0)^n}\,\frac{1}{g(z)} = \frac{h(z)}{(z - z_0)^n}, \]

where h(z) is analytic in the neighborhood of $z = z_0$. At the point $z = z_0$
the function q(z) ceases to be analytic. We call this point a singularity
(singular point). In this particular case the singularity is called
a pole of the function q(z) of the nth order. If we think of the function
h(z) as expanded in powers of $(z - z_0)$ and then divided by $(z - z_0)^n$
term by term, in the neighborhood of the pole we obtain an expansion
of the form

\[ q(z) = c_{-n}(z - z_0)^{-n} + \cdots + c_{-1}(z - z_0)^{-1} + c_0 + c_1(z - z_0) + \cdots, \]

where the coefficients of the powers of $(z - z_0)$ are denoted by
$c_{-n}, \ldots, c_{-1}, c_0, c_1, \ldots$.

If we are dealing with a pole of the first order (i.e., if n = 1), we
obtain the coefficient $c_{-1}$ immediately from the relation

\[ c_{-1} = \lim_{z \to z_0} (z - z_0)\,q(z). \]


Since

\[ \frac{1}{q(z)(z - z_0)} = \frac{f(z)}{z - z_0} = \frac{f(z) - f(z_0)}{z - z_0}, \]

we have for the coefficient of $1/(z - z_0)$ in the expansion of q(z),

\[ c_{-1} = \frac{1}{f'(z_0)}. \tag{8.21a} \]

In the same way, if q(z) = r(z)/φ(z), where φ(z) has a zero of the first
order at $z = z_0$ and $r(z_0) \ne 0$, we have in the expansion of q(z)

\[ c_{-1} = \frac{r(z_0)}{\varphi'(z_0)}. \tag{8.21b} \]

If a function is defined and analytic everywhere in the neighborhood
of a point $z_0$ but not at the point itself, its integral around a
complete circle enclosing the point zo will in general not be zero. By 
Cauchy’s theorem, however, the integral is independent of the radius 
of this circle and in general has the same value for all closed curves 
C that form the boundary of a sufficiently small region enclosing the 
point zo. The value of the integral taken around the point in the 
positive sense is called the residue at the point. 

If the singularity is a pole of the nth order and if we integrate the
expansion of the function, the integral of the series with positive indices
is zero, as this power series is still analytic at the point $z_0$.

When integrated, the term $c_{-1}(z - z_0)^{-1}$ gives the value $2\pi i c_{-1}$, while
the terms with higher negative indices give 0, for the indefinite
integral of $(z - z_0)^{-\nu}$ for ν > 1 is $(z - z_0)^{-\nu+1}/(1 - \nu)$, as in the real
case, so that the integral around a closed curve vanishes.

The residue of a function at a pole is therefore $2\pi i c_{-1}$.

In the next section we shall become acquainted with the usefulness 
of this idea as expressed by the following theorem: 


THEOREM OF RESIDUES. If the function f(z) is analytic in the 
interior of a region R and on its boundary C except at a finite number 
of interior poles, the integral of the function taken around C in the 
positive sense is equal to the sum of the residues of the function at the 
poles enclosed by the boundary C. 

The proof follows at once from the statements above. 
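
The theorem can be tried out numerically on a simple case. The following sketch is an illustration under assumptions of its own: the sample function e^z/(z² + 1), the circle |z| = 2, and the helper names are not from the text. The residue at each simple pole is formed as in the text, that is, as 2πi times the coefficient c₋₁ = r(z₀)/φ′(z₀) of (8.21b).

```python
import cmath
import math


def circle_integral(q, center, radius, n=4000):
    """Integral of q(z) dz around a circle, taken in the positive sense."""
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * k * dtheta)
        total += q(center + radius * e) * 1j * radius * e * dtheta
    return total


def q(z):
    """Sample function r(z)/phi(z) with r = exp and phi = z^2 + 1 (simple poles at i and -i)."""
    return cmath.exp(z) / (z * z + 1)


def residue_simple_pole(z0):
    """Residue in the text's sense, 2*pi*i*c_{-1}, with c_{-1} = r(z0) / phi'(z0)."""
    return 2j * math.pi * cmath.exp(z0) / (2 * z0)


contour_value = circle_integral(q, 0j, 2.0)
residue_sum = residue_simple_pole(1j) + residue_simple_pole(-1j)
print(contour_value, residue_sum)   # both close to 2*pi*i*sin(1)
```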


Exercises 8.4 


1. Prove, without using the theory of power series directly, that the deriva- 
tive of an analytic function is differentiable by successive differentia- 
tion under the integral sign in Cauchy’s formula and justify the validity 
of this process. 

2. Show that the function

\[ g(z) = \frac{1}{2\pi i} \oint \frac{f(\zeta)}{\zeta^n}\,\frac{\zeta^n - z^n}{\zeta - z}\,d\zeta, \]

where the integral is taken around a simple contour enclosing the
points ζ = 0 and ζ = z, is a polynomial g(z) of degree n − 1 such that

\[ g^{(m)}(0) = f^{(m)}(0) \qquad (m = 0, 1, \ldots, n - 1). \]


3. Show that for every potential function u it is possible to construct a 
conjugate function v and to determine it uniquely apart from an additive 
constant provided the domain is simply connected. 

4. What are the residues of $f(z) = (2z - 1)/(z^2 - 1)$ at its poles?

5. If f(z) is bounded, |f(z)| < M, on the entire complex plane, show that

\[ f(z) - f(0) = \frac{1}{2\pi i} \oint \frac{z\,f(t)}{t(t - z)}\,dt \]

can be made as small as one pleases by taking the integral over a sufficiently
large circle. Consequently, f(z) = f(0); that is, the function is
constant.

6. Let f(z) be analytic for |z| ≤ ρ. If M is the maximum of |f(z)| on the circle
|z| = ρ, then the coefficients of the power series for f,

\[ f(z) = \sum_{\nu=0}^{\infty} a_\nu z^\nu, \]

satisfy the inequality

\[ |a_\nu| \le \frac{M}{\rho^\nu}. \]

Note that the conclusion of Exercise 5 follows also from this result.

7. Let $P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$ be a polynomial of positive
degree n. Show that the assumption that P(z) has no roots implies that
f(z) = 1/P(z) is bounded and, hence, constant, by Exercise 5 or Exercise 
6, and, then, that f(z) is identically zero. This proves the fundamental 
theorem of algebra, that every polynomial of positive degree with com- 
plex coefficients has at least one complex root. 

8. Let f(z) be analytic in the interior of, and on, a simple closed curve C
with the possible exception of a finite number of points in the interior.
Consider

\[ I = \frac{1}{2\pi i} \int_C \frac{f'(z)}{f(z)}\,dz \]

taken in the positive sense around C.

(a) Show that if f has a zero of order n at α and no other poles or zeros
in the interior of or on C, then I = n.

(b) Show that if f has a pole of order m at α and no poles or zeros at any
other point in or on C, then I = −m.

(c) Show that if f has a finite number of zeros and poles in C, none on
C, then I is the number of zeros minus the number of poles, counting
multiplicity; that is, if the zeros have multiplicities $n_1, n_2, \ldots, n_j$
and the poles, multiplicities $m_1, m_2, \ldots, m_k$, then

\[ I = n_1 + n_2 + \cdots + n_j - m_1 - m_2 - \cdots - m_k. \]
9. (a) Two polynomials P(z) and Q(z) are such that at every point on a certain
closed contour C

\[ |Q(z)| < |P(z)|. \]

Prove that the equations P(z) = 0 and P(z) + Q(z) = 0 have the
same numbers of roots within C. (Consider the family of functions
P(z) + θQ(z), where the parameter θ varies from 0 to 1.)

(b) Prove that all the roots of the equation

\[ z^5 + az + 1 = 0 \]

lie within the circle |z| = r if

\[ |a| < r^4 - \frac{1}{r}. \]
10. Use Exercise 8(b) to show that a polynomial P(z) of degree n has pre- 
cisely n roots, counting multiplicity. 


11. (a) If f(z) has one simple root α within a closed curve C, prove that this
root is given by

\[ \alpha = \frac{1}{2\pi i} \int_C z\,\frac{f'(z)}{f(z)}\,dz. \]

(b) Interpret the integral of part (a) when f(z) has finitely many zeros
and poles in, but not on, C.

12. Prove that $e^z$ cannot vanish for any value of z.


8.5 Applications to Complex Integration (Contour Integration) 


Cauchy’s theorem and the theorem of residues frequently enable 
us to evaluate real definite integrals by regarding these as integrals 
along the real axis of a complex plane and then simplifying the argument
by suitable modification of the path of integration.¹ In this way
we sometimes obtain surprisingly elegant evaluations of apparently 
complicated definite integrals, without necessarily being able to 
calculate the corresponding indefinite integrals. We shall discuss 
some typical examples. 


a. Proof of the Formula 


\[ \int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \tag{8.22} \]

Here we give the following instructive proof of this important 
formula, which we have already discussed by other methods (Volume 
I, p. 589; Volume II, p. 471). 

We integrate the function $e^{iz}/z$ in the complex z-plane along the
path C shown in Fig. 8.8, which consists of a semicircle $H_R$ of radius
R and a semicircle $H_r$ of radius r, both having their centers at the
origin, and the two symmetrical intervals $I_1$ and $I_2$ of the real axis.
Since the function $e^{iz}/z$ is regular in the circular ring enclosed by
these boundaries, the value of the integral in question is zero. Combining
the integrals along $I_1$ and $I_2$, we have

\[ \int_{H_R} \frac{e^{iz}}{z}\,dz + \int_{H_r} \frac{e^{iz}}{z}\,dz + 2i \int_r^R \frac{\sin x}{x}\,dx = 0. \]

Figure 8.8

¹ It is always necessary to reduce the integral considered to one over a closed path
in the complex plane.

We now let R tend to infinity. Then the integral along the semicircle
$H_R$ tends to zero, for if we put $z = R(\cos\theta + i\sin\theta) = Re^{i\theta}$ for points
of the semicircle, we have

\[ e^{iz} = e^{iR\cos\theta}\, e^{-R\sin\theta}, \]

and the integral becomes

\[ i \int_0^{\pi} e^{iR\cos\theta}\, e^{-R\sin\theta}\,d\theta. \]

The absolute value of the factor $e^{iR\cos\theta}$ is 1, while the absolute value
of the factor $e^{-R\sin\theta}$ is less than 1 and, moreover, tends uniformly to
zero as R tends to infinity, in every interval ε ≤ θ ≤ π − ε. Hence, it
follows at once that the integral along $H_R$ tends to zero as R → ∞. As
the reader can easily prove for himself, the integral along the semicircle
$H_r$ tends to −πi as r → 0. The integral along the two symmetrical
intervals $I_1$, $I_2$ of the real axis tends to

\[ 2i \int_0^{\infty} \frac{\sin x}{x}\,dx \quad \text{as } R \to \infty \text{ and } r \to 0. \]

Combining these statements, we immediately obtain the relation (8.22).
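
The two limiting statements about the semicircles can be checked numerically. The sketch below is only an illustration: the radii 50 and 0.001, the midpoint rule, and the function name are assumptions, and the orientation (the large semicircle counterclockwise, the small one clockwise) is inferred from the description of the closed path.

```python
import cmath
import math


def upper_semicircle_integral(radius, n=20000):
    """Integral of e^{iz}/z dz over z = radius*e^{i*theta}, theta from 0 to pi.

    On this path dz/z = i d(theta), so the integral equals
    i * integral_0^pi exp(i * radius * e^{i*theta}) d(theta) (midpoint rule).
    """
    h = math.pi / n
    total = 0j
    for k in range(n):
        theta = (k + 0.5) * h
        total += cmath.exp(1j * radius * cmath.exp(1j * theta))
    return 1j * total * h


# The large semicircle is traversed counterclockwise, the small one clockwise,
# hence the sign change for the small radius.
over_large = upper_semicircle_integral(50.0)
over_small = -upper_semicircle_integral(1e-3)
print(abs(over_large))   # small, and smaller still for larger radii
print(over_small)        # close to -pi*i
```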
b. Proof of the Formula 

\[ \int_0^{\infty} \cos(ax)\, e^{-x^2}\,dx = \frac{1}{2}\sqrt{\pi}\, e^{-a^2/4}. \tag{8.23} \]


(Compare Section 4.12, p. 476 Exercise 9a.) 


We integrate the expression $e^{-z^2}$ along a rectangle ABB′A′ (Fig.
8.9), in which the length of the vertical sides AA′, BB′ is a/2 and that
of the horizontal sides AB, A′B′ is 2R. This integral has the value
zero, by Cauchy’s theorem. On the vertical sides we have

\[ |e^{-z^2}| = |e^{-(x^2 - y^2)}\, e^{-2ixy}| = e^{-R^2} e^{y^2} \le e^{-R^2} e^{a^2/4}, \]

and this expression tends uniformly to zero as R tends to infinity.
Thus, the portions of the whole integral that arise from the vertical
sides tend to zero and if we carry out the passage to the limit R → ∞
and note that $dz = d(x + \tfrac{1}{2}ia) = dx$ on A′B′, we may express the
result of Cauchy’s theorem as follows:

\[ \int_{-\infty}^{+\infty} e^{-(x + ia/2)^2}\,dx = \int_{-\infty}^{+\infty} e^{-x^2}\,dx. \]

Figure 8.9

That is, we can displace the path of integration of the infinite integral
parallel to itself. By our previous result (see p. 415) the value of the
integral on the right is $\sqrt{\pi}$. The integral on the left immediately
becomes

\[ e^{a^2/4} \int_{-\infty}^{+\infty} e^{-x^2} (\cos ax - i \sin ax)\,dx = 2e^{a^2/4} \int_0^{\infty} \cos ax\; e^{-x^2}\,dx, \]


since sin ax is an odd function and cos ax an even function. This 
proves formula (8.23). 
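
Formula (8.23) is easy to test numerically. The following sketch makes its own assumptions: the integral is truncated at x = 10, Simpson's rule is used, the sample values of a are arbitrary, and the helper names are hypothetical.

```python
import math


def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def lhs(a, upper=10.0, n=4000):
    """Left side of (8.23), truncated at x = upper (the neglected tail is below e^{-100})."""
    return simpson(lambda x: math.cos(a * x) * math.exp(-x * x), 0.0, upper, n)


def rhs(a):
    """Right side of (8.23)."""
    return 0.5 * math.sqrt(math.pi) * math.exp(-a * a / 4)


for a in (0.0, 1.0, 2.5):
    print(a, lhs(a), rhs(a))   # the two values should agree closely
```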


c. Application of the Theorem of Residues to the Integration of 
Rational Functions 
For the rational function

\[ Q(z) = \frac{a_0 + a_1 z + \cdots + a_m z^m}{b_0 + b_1 z + \cdots + b_n z^n}, \]

if the denominator has no real zeros and its degree exceeds that of the
numerator by at least two, the integral

\[ I = \int_{-\infty}^{+\infty} Q(x)\,dx \]


can be evaluated in the following way: We begin by taking the
integral along a contour consisting of the boundary of a semicircle
H of large radius R (on which $z = Re^{i\theta}$, 0 ≤ θ ≤ π) and the real axis
from −R to +R. The radius R is chosen so large that all the zeros of
the denominator lie in the interior of the circle. Consequently, all the
poles of Q(z) lie in the interior of the circle. On one hand, the
integral is equal to the sum of the residues of Q(z) within the semicircle,
while, on the other, it is equal to the integral

\[ I_R = \int_{-R}^{+R} Q(x)\,dx \]

plus the integral along the semicircle H. By our assumptions, a fixed
positive constant M exists such that for sufficiently large values of
R we have¹

\[ |Q(z)| < \frac{M}{R^2}. \]

The length of the circumference of the semicircle is πR. By our
estimation formula (8.11c) on p. 789, the integral along the semicircle
is therefore less in absolute value than

\[ \pi R\,\frac{M}{R^2} = \frac{\pi M}{R} \]

and, hence, tends to zero as R → ∞. This means that the integral

\[ I = \int_{-\infty}^{+\infty} Q(x)\,dx \]

is equal to the sum of the residues of Q(z) in the upper half-plane.
We now apply this principle to some interesting special cases. 
We begin by taking

\[ Q(z) = \frac{1}{az^2 + bz + c} = \frac{1}{f(z)}, \]

where the coefficients a, b, c are real and satisfy the conditions a > 0,
$b^2 - 4ac < 0$. The function Q(z) has one simple pole in the upper half-plane
at the point

\[ z_1 = \frac{1}{2a} \left\{ -b + i\sqrt{4ac - b^2} \right\}, \]

where the square root is to be taken positive, in the upper half-plane.
By the general rule (8.21a), therefore, the residue is $2\pi i\,[1/f'(z_1)]$. Since

\[ f'(z_1) = 2az_1 + b = i\sqrt{4ac - b^2}, \]

we have

\[ \int_{-\infty}^{+\infty} \frac{dx}{ax^2 + bx + c} = \frac{2\pi}{\sqrt{4ac - b^2}}. \tag{8.24a} \]

¹ This follows immediately from the fact that $Q(z) = (1/z^2) R(z)$, where R(z) tends to
zero as z → ∞ (when n > m + 2) or to $a_m/b_n$ (when n = m + 2).
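As a second example, we shall prove the formula (cf. Volume I, p.
290)

\[ \int_{-\infty}^{+\infty} \frac{dx}{1 + x^4} = \frac{\pi}{2}\sqrt{2}. \tag{8.24b} \]

Here again, we can immediately apply our general principle. In
the upper half-plane the function $1/(1 + z^4) = 1/f(z)$ has the two
poles $z_1 = \varepsilon = e^{(1/4)\pi i}$, $z_2 = -\varepsilon^{-1}$ (the two fourth roots of −1 that have
a positive imaginary part). The sum of the residues is

\[ 2\pi i \left( \frac{1}{4z_1^3} + \frac{1}{4z_2^3} \right) = -\frac{2\pi i}{4}\,(z_1 + z_2) = -\frac{\pi i}{2} \left( \varepsilon - \varepsilon^{-1} \right) = -\pi i \cdot i \sin\frac{\pi}{4} = \pi \sin\frac{\pi}{4} = \frac{\pi}{2}\sqrt{2}, \]

as was asserted.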
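
The value (π/2)√2 for the integral of 1/(1 + x⁴) over the whole real axis can be confirmed in two independent ways. The sketch below is illustrative: the residue sum uses c₋₁ = 1/f′(z₀) from (8.21a), while the direct integral is a midpoint sum over the truncated range [−100, 100], a range chosen here only for the demonstration. The function names are hypothetical.

```python
import cmath
import math


def residue_sum_upper_half_plane():
    """Sum of 2*pi*i*c_{-1} over the poles of 1/(1 + z^4) with positive imaginary part.

    Each pole z0 is a simple zero of f(z) = 1 + z^4, so c_{-1} = 1/f'(z0) = 1/(4*z0^3).
    """
    poles = [cmath.exp(1j * math.pi / 4), cmath.exp(3j * math.pi / 4)]
    return sum(2j * math.pi / (4 * z0 ** 3) for z0 in poles)


def direct_integral(limit=100.0, n=100000):
    """Midpoint-rule value of the integral of 1/(1 + x^4) over [-limit, limit]."""
    h = 2 * limit / n
    return h * sum(1.0 / (1.0 + (-limit + (k + 0.5) * h) ** 4) for k in range(n))


expected = math.pi / 2 * math.sqrt(2)
print(residue_sum_upper_half_plane())   # close to expected, imaginary part near 0
print(direct_integral())                # agrees up to the truncated tails
print(expected)
```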
The following proof of the formula

\[ \int_{-\infty}^{+\infty} \frac{dx}{(x^2 + 1)^{n+1}} = \pi\,\frac{(2n)!}{2^{2n}(n!)^2} \tag{8.24c} \]

exemplifies the case where the residue at a pole of higher order has
to be calculated.

If we replace x by z, the denominator of the integrand is of the
form $(z + i)^{n+1}(z - i)^{n+1}$, and the integrand accordingly has a pole
of the (n + 1)-th order at the point z = +i. To find the residue at that
point, we write

\[ \frac{1}{(z^2 + 1)^{n+1}} = \frac{1}{(z - i)^{n+1}}\,\frac{1}{(2i + z - i)^{n+1}} = \frac{1}{(z - i)^{n+1}}\,\frac{1}{(2i)^{n+1}} \left( 1 + \frac{z - i}{2i} \right)^{-n-1}. \]

If we expand the last factor by the binomial theorem, the term in
$(z - i)^n$ has the coefficient

\[ \frac{1}{(2i)^n} \binom{-n-1}{n} = \frac{(-1)^n}{(2i)^n}\,\frac{(n+1)(n+2)\cdots 2n}{n!} = \frac{(-1)^n}{(2i)^n}\,\frac{(2n)!}{(n!)^2}. \]

The coefficient $c_{-1}$ in the series for the integrand in the neighborhood
of the point z = i is therefore equal to

\[ \frac{1}{2^{2n+1}}\,\frac{1}{i}\,\frac{(2n)!}{(n!)^2}. \]

The residue $2\pi i c_{-1}$ is therefore

\[ \frac{\pi}{2^{2n}}\,\frac{(2n)!}{(n!)^2}, \]

which proves the formula.
As a further exercise the reader may prove for himself by the
theory of residues that

\[ \int_0^{\infty} \frac{x \sin x}{x^2 + c^2}\,dx = \frac{1}{2}\pi e^{-|c|} \tag{8.24d} \]

(replacing sin x by $e^{ix}$).


d. The Theorem of Residues and Linear Differential Equations 
with Constant Coefficients 


Let

\[ a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n = P(z) \]

be a polynomial of the nth degree and t a real parameter. We think of
the integral

\[ u(t) = \int_C \frac{e^{tz} f(z)}{P(z)}\,dz, \tag{8.25} \]

taken along any closed path C in the z-plane, which does not pass
through any of the zeros of P(z), as a function u(t) of the parameter t.
Let f(z) be a constant or any polynomial in z, of a degree that we shall
assume to be less than n. By the rules for differentiation under the
integral sign, which hold unaltered for the complex plane, we can
differentiate the expression u(t) once or repeatedly with respect to t.
This differentiation with respect to t under the integral sign is
equivalent to multiplication of the integrand by z, z², z³, . . ., as the
case may be. If we now form the differential expression
$L[u] = a_0 u + a_1 u' + a_2 u'' + \cdots + a_n u^{(n)}$, or, in symbolic notation, P(D)u, where
D denotes the symbol of differentiation D = d/dt, we have

\[ P(D)u = L[u] = \int_C e^{tz} f(z)\,dz. \]


By Cauchy’s theorem, the value of the complex integral on the 
right is 0; that is, the function u(t) is a solution of the differential 
equation L[u] = 0. If f(z) is any polynomial of the (n — 1)-th degree, 
this solution contains n arbitrary constants. We may accordingly 
expect to get in this way the most general solution of the linear 
differential equation with constant coefficients, L[u] = 0. 

In fact, we do obtain the solutions in the form that we already
know (cf. Chapter 6, p. 696), on evaluating the integral by the theory
of residues, with the assumption that the curve C encloses all the
zeros $z_1, z_2, \ldots, z_n$ of the denominator $P(z) = a_n(z - z_1)(z - z_2) \cdots (z - z_n)$.
If we assume to begin with that all these zeros are
simple zeros, they are simple poles of the integrand, and the residue
at the point $z_\nu$ is by formula (8.21b) given by

\[ 2\pi i\,\frac{f(z_\nu)}{P'(z_\nu)}\,e^{t z_\nu}. \]

By suitable choice of the polynomial f(z) the expressions $f(z_\nu)/P'(z_\nu)$
can be made arbitrary constants; we accordingly obtain the solution
in the form

\[ u(t) = \sum_{\nu=1}^{n} c_\nu e^{t z_\nu}, \]

in agreement with our previous results.

If a zero $z_\nu$ of the polynomial P(z) is multiple, say r-fold, so that
the corresponding pole of the integrand is of the rth order, the residue
at the point $z_\nu$ must be determined by expanding the numerator
$e^{tz} f(z) = e^{tz_\nu} e^{t(z - z_\nu)} f(z)$ in powers of $z - z_\nu$. We leave it to the reader to
show that the residue at the point $z_\nu$ gives the solutions $t e^{tz_\nu}, \ldots,
t^{r-1} e^{tz_\nu}$ as well as the solution $e^{tz_\nu}$.

Exercises 8.5 


1. (a) Let f(z) be analytic and g(z) have a pole of order n at z = α. Obtain
an expression for the residue of f(z)g(z) at z = α.

(b) In particular, if $g(z) = (z - \alpha)^{-n}$, show that the residue is

\[ \frac{2\pi i}{(n - 1)!}\, f^{(n-1)}(\alpha). \]

2. If f(z) has a zero of order 2 at α, show that the residue of 1/f(z) at α is

\[ -\frac{4\pi i}{3}\,\frac{f'''(\alpha)}{[f''(\alpha)]^2}. \]


3. Evaluate, for nonnegative integers n, m with n > m, the following inte- 
grals: 


oo 2 
(a) i i+ x + x dx 
e 1 
© J ara” 


0 yem 
(c) Í oid xn dx. 


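4. Let f(z) be a polynomial of degree n with the simple roots $\alpha_1, \alpha_2, \ldots, \alpha_n$.
Prove that

\[ \sum_{\nu=1}^{n} \frac{\alpha_\nu^k}{f'(\alpha_\nu)} = 0 \qquad (k = 0, 1, \ldots, n - 2). \]

(Consider $\int \frac{z^k}{f(z)}\,dz$ around a closed curve enclosing all the $\alpha_\nu$.)

5. Derive the result of (8.24d), namely,

\[ \int_0^{\infty} \frac{x \sin x}{x^2 + c^2}\,dx = \frac{1}{2}\pi e^{-|c|}. \]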


8.6 Many-Valued Functions and Analytic Extension 


In defining functions both real and complex, we have hitherto
always adopted the point of view that for each value of the independent
variable the value of the function must be unique. Even Cauchy’s
theorem, for example, is based on the assumption that the function
can be defined uniquely in the region under consideration. All the
same, many-valuedness often arises of necessity in the actual construction
of functions (e.g., in finding the inverse of a unique function
such as the nth power). In the real case, we separated different one-valued
branches of the inverse function in inversion processes such
as $\sqrt{z}$ or $\sqrt[n]{z}$. We shall see, however, that in the complex case this
separation is no longer reasonable, for the various one-valued
branches are now interconnected in a way that makes any separation
of them rather artificial.

We must be content here with a very simple discussion based on 
typical examples. 

For instance, we consider the inverse $\zeta = \sqrt{z}$ of the function $z = \zeta^2$.
To each nonzero value of z there correspond the two possible
solutions ζ and −ζ of the equation $z = \zeta^2$. These two branches of the
function are connected in the following way: Let $z = re^{i\theta}$. If we then
put $\zeta = \sqrt{r}\, e^{i\theta/2} = f(z)$, ζ = f(z) is certainly analytic in every simply
connected region R excluding the origin [where f(z) is no longer
differentiable]. In such a region, ζ is uniquely defined, by our previous
statement. If, however, we let the point z move around the origin on
a concentric circle K, say in the positive direction, $\zeta = \sqrt{r}\, e^{i\theta/2}$ will vary
continuously; the angle θ, however, will not return to its original
value but will be increased by 2π. Hence, in this continuous extension
when we come back to the point z, we no longer have the initial value
$\zeta = \sqrt{r}\, e^{i\theta/2}$, but the value $\sqrt{r}\, e^{i\theta/2} e^{2\pi i/2} = -\zeta$. We say that when the
function f(z) is continuously extended on the closed curve K it is not
unique.
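
The change of sign after one revolution can be observed by following the square root numerically along a circle about the origin. This is a sketch under assumptions of its own: the starting point 2 + i, the number of steps, and the rule "pick the root nearest to the previous value" are illustrative devices standing in for continuous extension, and the function name is hypothetical.

```python
import cmath
import math


def continue_sqrt(z0, turns=1, steps=1000):
    """Follow sqrt(z) continuously while z circles the origin `turns` times.

    z starts at z0 and moves along |z| = |z0| in the positive sense. At each
    small step the square root nearest to the previous value is selected,
    which mimics continuous extension along the path.
    """
    w = cmath.sqrt(z0)                       # starting branch value
    for k in range(1, turns * steps + 1):
        z = z0 * cmath.exp(2j * math.pi * k / steps)
        candidate = cmath.sqrt(z)
        if abs(candidate - w) > abs(-candidate - w):
            candidate = -candidate           # switch to the other root if it is nearer
        w = candidate
    return w


z0 = 2 + 1j
start = cmath.sqrt(z0)
print(continue_sqrt(z0, turns=1), -start)   # one revolution: sign reversed
print(continue_sqrt(z0, turns=2), start)    # two revolutions: original value
```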

The function $\sqrt[n]{z}$, where n is an integer, exhibits exactly the same
behavior. Here every revolution multiplies the value of the function
by the nth root of unity, namely, $\varepsilon = e^{2\pi i/n}$, and the function only
returns to its original value after n revolutions.

In the case of the function log z, we saw (p. 795) that there is a
similar many-valuedness, in that, in traveling once continuously
around the origin in the positive sense, the value of log z is increased
by 2πi.

Again, the function $z^\alpha$ is multiplied by $e^{2\pi i\alpha}$ per revolution.

All these functions, although in the first instance uniquely defined 
in a region R, are found to be many-valued when we extend them 
continuously (as analytic functions) and return to the starting point 
by a certain closed path. This phenomenon of many-valuedness and 
the associated general theory of analytic extension cannot be in- 
vestigated in greater detail within the limits of this book. We merely 
point out that the uniqueness of the values of a function can theoretically
be ensured by drawing certain lines in the z-plane that the path
traced by z is not allowed to cross, or, as we say, by making cuts along
certain lines. These cuts are so arranged that closed paths in the
plane that lead to many-valuedness are no longer possible.

For example, the function log z is made one-valued by cutting the
z-plane along the negative real axis. The same applies to the function
$\sqrt{z}$. The function $\sqrt{1 - z^2}$ becomes one-valued if we make a cut along
the real axis between −1 and +1.

Once the plane has been cut in this way, Cauchy’s theorem can 
at once be applied to these functions. We give a simple example by 
proving the formula 


\[ I = \int_{-1}^{+1} \frac{dx}{(x - k)\sqrt{1 - x^2}} = \frac{\pi}{\sqrt{k^2 - 1}}, \tag{8.26} \]

where k is a constant that does not lie on the real axis between −1
and +1.
We begin by noting that the function

\[ \frac{1}{(z - k)\sqrt{1 - z^2}} \]

is one-valued in the z-plane, provided we make a cut along the
real axis from −1 to +1. If in the complex plane we approach this cut
S first from above and then from below, we obtain equal and opposite
values for the square root $\sqrt{1 - z^2}$, say, positive from above and
negative from below. We now take the complex integral

\[ \int_C \frac{dz}{(z - k)\sqrt{1 - z^2}} \]

along a path C as indicated in Fig. 8.10. By Cauchy’s theorem we
can make this path contract round the cut without altering the value
of the integral. The integral is therefore equal to the limiting value
obtained when this contraction is made, which is obviously equal to
2I. On the other hand, if we take the integral of the same integrand
along the circumference of a circle K with radius R and center at the
origin, this integral, by our previous investigations, tends to zero as
R increases.¹ By the theorem of residues, however, the sum of the
integrals along C and K is equal to the residue of the integrand at the
enclosed pole z = k; hence, 2I is equal to the residue in question.
This residue is

\[ 2\pi i\,\frac{1}{\sqrt{1 - k^2}} = \frac{2\pi}{\sqrt{k^2 - 1}}, \]

which proves our statement.

Figure 8.10

¹ In fact, its value is actually zero, since by Cauchy’s theorem it is independent of the
radius R, provided that the circle encloses the pole z = k.

Example of Analytic Extension: The Gamma Function 


In conclusion we give yet another example showing how an
analytic function, originally defined in a part of the plane, can be
extended beyond the original region of definition. We shall extend the
gamma function, which was defined for x > 0 by the equation

\[ \Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt, \tag{8.28} \]

analytically for x ≤ 0 also. We could do this by means of the functional
equation

\[ \Gamma(z) = \frac{1}{z}\,\Gamma(z + 1), \]

using this equation to define Γ(z − 1) when Γ(z) is known. By
means of this equation, we can imagine Γ(z) as extended first to the
strip −1 < x ≤ 0 and subsequently extended to the next strip −2 <
x ≤ −1, and so on.
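
The strip-by-strip extension by the functional equation can be sketched for real arguments. The code below is illustrative only: it restricts itself to real x, uses the standard-library `math.gamma` solely for positive arguments, and reaches negative arguments by repeated use of Γ(x) = Γ(x + 1)/x. The function name is hypothetical.

```python
import math


def gamma_by_functional_equation(x):
    """Gamma(x) for real x, using math.gamma only where x > 0.

    For x <= 0 the value is obtained by repeated use of
    Gamma(x) = Gamma(x + 1) / x; the points 0, -1, -2, ... are excluded.
    """
    if x > 0:
        return math.gamma(x)
    if x == math.floor(x):
        raise ValueError("Gamma has a pole at 0, -1, -2, ...")
    return gamma_by_functional_equation(x + 1) / x


print(gamma_by_functional_equation(-0.5), -2 * math.sqrt(math.pi))
print(gamma_by_functional_equation(-2.5), math.gamma(-2.5))
```

Each recursive call moves the argument one strip to the right until it is positive, mirroring the order in which the strips are reached in the description above.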
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We adopt another method, of greater theoretical interest, for 
extending the gamma function. We consider the path C in the ¢-plane 
indicated in Fig. 8.11, which surrounds the positive real axis of the 
t-plane and approaches this axis asymptotically on either side. We 
easily see from Cauchy’s theorem that the value of the loop-integral,! 


J t2-le-t dt, 
C 


is unaltered when the loop is made to contract into the x-axis. The 
integrand #—1e-* then tends to different values as we approach the 
x-axis from above and below, the values differing by the factor e?*, 


> 


Figure 8.11 Loop-integral for the gamma function. 


For x > 0, we thus obtain the formula

\[ (1 - e^{2\pi i z})\,\Gamma(z) = \int_C t^{z-1} e^{-t}\,dt. \]

This formula is derived subject to the assumption that x, the real
part of z, is positive. We see now, however, that the loop-integral has
a meaning, no matter what the complex number z is, since it avoids
the origin t = 0. This loop-integral therefore represents a function
defined throughout the z-plane. We then define this function by
stating that it is equal to $(1 - e^{2\pi i z})\Gamma(z)$ throughout the z-plane. The
gamma function has thus been analytically extended to the whole
of the z-plane, except the points with x ≤ 0 for which the factor
$(1 - e^{2\pi i z})$ vanishes, that is, except the points z = 0, z = −1,
z = −2, and so on.

For more detailed and extensive investigations the reader is
referred to the literature of the theory of functions.²


Miscellaneous Exercises 8 


1. Write down the condition that three points $z_1, z_2, z_3$ may lie in a
straight line.


¹This is again an improper integral, which arises by a passage to a limit from an
integral along a finite portion of C. The reader may satisfy himself that it exists by an
argument similar to those previously employed.

²For example, L. V. Ahlfors, Complex Analysis, N. Y.: McGraw-Hill, 1953.


2. Show that three distinct points α, β, γ of the complex plane form an
isosceles triangle with vertex at γ if and only if there exists a real
positive k for which

3. Write down the condition that four points $z_1, z_2, z_3, z_4$ may lie on a circle.

4. Let A, B, C, D in the z-plane be four points in order on the circum-
ference of a circle, with coordinates $z_1, z_2, z_3, z_4$. Using these complex
coordinates, show that AB · CD + BC · AD = AC · BD.

5. Prove that the equation cos z = c can be solved for all values of c.

6. For which values of c has the equation tan z = c no solution?

7. For which values of z is (a) cos z, (b) sin z real?

8. Find the radius of convergence of the power series $\sum a_n z^n$, where

(a) $a_n = 1/n^s$, s being a complex number with a positive real part

(b) $a_n = n^n$

(c) $a_n = \log n$.

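9. Evaluate the integrals

(a) $\int_0^\infty \frac{\cos x}{1 + x^4}\,dx$

(b) $\int_0^\infty \frac{x^2 \cos x}{1 + x^4}\,dx$

(c) $\int_0^\infty \frac{\cos x}{q^2 + x^2}\,dx$

(d) $\int_0^\infty \frac{x^{a-1}}{(x + 1)(x + 2)}\,dx$ for 1 < a < 2

by complex integration.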
10. Find the poles and residues of the functions

\[ \frac{1}{\sin z}, \quad \frac{1}{\cos z}, \quad \Gamma(z), \quad \cot z = \frac{\cos z}{\sin z}. \]

11. Show that if x and y are real

\[ |\sinh(x + iy)| \ge A(x), \]

where A(x) is independent of y and tends to ∞ as x → ±∞.
By integrating 1/[(z − w) sinh z] round a suitable sequence of
contours, show that

\[ \frac{1}{\sinh w} = \frac{1}{w} + 2w \sum_{n=1}^{\infty} \frac{(-1)^n}{w^2 + n^2\pi^2}. \]


12. Find the limiting value of the integral

\[ \int_{C_n} \frac{\cot \pi t}{t - z}\,dt \]

as n → ∞, where the path of integration is a square $C_n$ with its sides
parallel to the axes at a distance n + ½ from the origin. Hence, using the
theorem of residues, obtain the expression for cot πz in partial fractions.

13. Using the equation

\[ \log(1 + z) = \int_0^z \frac{dt}{1 + t}, \]

show that the power series for log (1 + z) converges everywhere on
the unit circle |z| = 1, except at the point z = −1. By equating the
imaginary part of the series to the imaginary part of $\log(1 + e^{i\theta})$,
establish the truth of the Fourier series (cf. Volume I, p. 592)

\[ \tfrac{1}{2}\theta = \sin\theta - \tfrac{1}{2}\sin 2\theta + \tfrac{1}{3}\sin 3\theta - + \cdots \qquad (-\pi < \theta < \pi). \]


14. Prove that if f is analytic, $(d^n/dx^n) f(x)$ is equal to the result obtained
by putting y and a each equal to $\sqrt{x}$ in the expression for


ar yf) 
dy™(y + a)rtt” 


15. (a) Prove that the series

\[ f(z) = \sum_{\nu=1}^{\infty} \frac{(-1)^{\nu+1}}{\nu^z} \]

converges for x > 0.

(b) Prove that this series provides an extension of the zeta function
(defined in Exercise 5, p. 797) to values of z such that 0 < x < 1,
by means of the formula

\[ f(z) = (1 - 2^{1-z})\,\zeta(z), \]

which is valid for x > 1.

(c) Prove that the zeta function has a pole of residue 1 at z = 1.


Solutions 


Exercises 1.1 (p. 10) 


1. (a) Write $z = r(\cos\theta + i\sin\theta)$ in polar form with $0 \le \theta < 2\pi$. Then,
by De Moivre's theorem (Volume I, p. 105),

\[ z^n = r^n(\cos n\theta + i\sin n\theta). \]

For r < 1, we have $\lim_{n\to\infty} r^n = 0$; therefore, $\lim_{n\to\infty} z^n = 0$. For r > 1, we
have $\lim_{n\to\infty} r^n = \infty$; therefore, the distance of $z^n$ from the origin, hence
from any given point, can be made arbitrarily large and the sequence
diverges. For r = 1, there are two cases: z = 1 (θ = 0), for which
$\lim_{n\to\infty} z^n = 1$, and $z = \cos\theta + i\sin\theta$ with θ ≠ 0. In the latter case, we have
for the distance between two successive points of the sequence

\[ |z^{n+1} - z^n| = |z^n|\cdot|z - 1| = |z - 1| = \sqrt{2 - 2\cos\theta}, \]

a fixed positive value; by the Cauchy test the sequence must then
diverge.
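The three cases can be observed numerically. The sketch below is only an illustration under assumed sample values (θ = 1, r = 0.9 and r = 1.1); the function name `successive_gap` is hypothetical.

```python
import cmath
import math

def successive_gap(z: complex, n: int) -> float:
    """Distance |z^(n+1) - z^n| between consecutive terms of the sequence z^n."""
    return abs(z ** (n + 1) - z ** n)

if __name__ == "__main__":
    theta = 1.0
    z = cmath.exp(1j * theta)                      # a point with r = 1, theta != 0
    fixed_gap = math.sqrt(2 - 2 * math.cos(theta))
    for n in (1, 10, 100):
        print(n, successive_gap(z, n), fixed_gap)  # the gap does not shrink
    print(abs((0.9 * z) ** 200))                   # r < 1: terms tend to 0
    print(abs((1.1 * z) ** 200))                   # r > 1: terms grow without bound
```

Since the gap between consecutive terms stays at the same positive value on the unit circle, the Cauchy criterion fails there, as stated in the solution.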
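(b) The primitive nth root of z is given in polar form by

\[ z^{1/n} = r^{1/n}\left(\cos\frac{\theta}{n} + i\sin\frac{\theta}{n}\right). \]

If z = 0, we have $\lim_{n\to\infty} z^{1/n} = 0$. Otherwise, we have on setting
$z^{1/n} = x_n + i y_n$,

\[ \lim_{n\to\infty} z^{1/n} = \lim_{n\to\infty} x_n + i\lim_{n\to\infty} y_n
 = \lim_{n\to\infty} r^{1/n}\cos\frac{\theta}{n} + i\lim_{n\to\infty} r^{1/n}\sin\frac{\theta}{n} = 1. \]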

2. Apply the limit theorems of Volume I to the components of $P_n$ separately.

3. For a point (a, b) satisfying $a^2 + b^2 < 1$, set $\alpha = \sqrt{a^2 + b^2}$. The neigh-
borhood $(x - a)^2 + (y - b)^2 < (1 - \alpha)^2$ of (a, b) is contained in the disk.
For a point (a, b) satisfying $a^2 + b^2 = 1$, every neighborhood contains
points not in the disk.

4. Let (a, b) be any point of S. Put $\gamma = b - a^2 > 0$. Consider an ε-neighbor-
hood of (a, b),

\[ (x - a)^2 + (y - b)^2 < \varepsilon^2. \]

For all points of the neighborhood, we have |x − a| < ε, |y − b| < ε.
Using

\[ a^2 = x^2 - 2(x - a)a - (x - a)^2, \]

we obtain

\[ y > b - \varepsilon = a^2 + \gamma - \varepsilon
   = x^2 - 2(x - a)a - (x - a)^2 + \gamma - \varepsilon
   > x^2 + \gamma - 2\varepsilon|a| - \varepsilon^2 - \varepsilon \ge x^2, \]

provided ε is taken as the smaller of 1 or γ/(2|a| + 2). Thus the ε-neighbor-
hood is in S.
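The choice of ε can be tested by sampling. This is a hedged sketch, assuming (as the inequality γ = b − a² > 0 suggests) that S is the region y > x²; the helper names `safe_radius` and `neighborhood_inside` are hypothetical.

```python
import math
import random

def safe_radius(a: float, b: float) -> float:
    """Radius from the solution: the smaller of 1 and gamma / (2|a| + 2)."""
    gamma = b - a * a
    if gamma <= 0:
        raise ValueError("(a, b) must satisfy b > a^2")
    return min(1.0, gamma / (2.0 * abs(a) + 2.0))

def neighborhood_inside(a: float, b: float, samples: int = 10000, seed: int = 0) -> bool:
    """Sample the open disk of radius safe_radius(a, b) and test y > x^2 at each point."""
    rng = random.Random(seed)
    eps = safe_radius(a, b)
    for _ in range(samples):
        r = eps * math.sqrt(rng.random())       # r < eps, uniform over the disk
        phi = 2.0 * math.pi * rng.random()
        x, y = a + r * math.cos(phi), b + r * math.sin(phi)
        if not y > x * x:
            return False
    return True

if __name__ == "__main__":
    print(neighborhood_inside(0.5, 1.0), neighborhood_inside(-3.0, 9.5))
```

Sampling cannot replace the proof, but a single failing sample would expose an error in the chosen radius.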

5. The segment (together with its end points if these are not considered as
points of the segment).


Problems 1.1 (p. 11) 


1. By definition, every neighborhood of the boundary point P contains
points of S. Choose $P_1$ in S so that $\overline{P_1P} < 1/2$. Since P is not in S, $P_1 \ne P$,
and therefore, $\overline{P_1P} > 0$. Now proceed by induction: given $P_n$ choose
$P_{n+1}$ in S so that $\overline{P_{n+1}P} < \tfrac{1}{2}\,\overline{P_nP}$. Clearly, the $P_n$ are distinct and
$\overline{P_nP} < 1/2^n$.

2. Let S be the given set; $S_c$, the closure of S; and $S_{cc}$, the closure of $S_c$.
Every point of $S_{cc}$ is either in $S_c$ or the boundary of $S_c$. If P is in the
boundary of $S_c$, then every neighborhood of P contains at least one
point Q of $S_c$ and one point R not in $S_c$. Since R is not in $S_c$, it is not in
S. Since a neighborhood is open, the neighborhood of P contains a
neighborhood of Q that must contain a point of S. Thus P is in $S_c$.

3. Let X be any point of S on PQ. The set of values of $\overline{PX}$ is bounded, since
$\overline{PX} \le \overline{PQ}$. Let R be the point on PQ at distance equal to lub $\overline{PX}$ from P.
Any neighborhood of R contains points of PQ that are in S and points
that are not in S.

4. All points of G are interior points.


Exercises 1.2 (p. 16) 


1. (a) 4 


1 
©) Gog we 
(e) 5. 


2. The domain is the set of points (x, y) and the range, the set of values u, 


where 
(a) y>o—x,u>o0 (jj) x=y=0,u=0 
(c) y>—x,u>0 (k) |y|<|x|, u real 


(e) y>—, u real D (9) #@,0,0<u<t 
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(2) x+y? +22<a2,0<w<a (m) y+—x,—7<u<Ž 


2 2 
(h) y # —x, u real (m) x#0,0<u<l 
G) x? + 2y? <3,0<u< V3 (o) Lcs+y<e0<u<rT 
(p) ann — > <x < Inn +5 and y>0O,or 


ann +7 <x < Wan and y<0.u2>0. 


3. For k variables,

\[ \frac{1}{k!}\,(n + 1)(n + 2)\cdots(n + k). \]


(Compare Volume I, Chapter 1, p. 117, Exercise 11.) 


Exercises 1.3 (p. 24) 


2. Discontinuous at x = y =Q. 
3. (a) Set x = ρ cos θ, y = ρ sin θ. Then

\[ |f(x, y)| = \rho^3\,|\cos^3\theta - 3\cos\theta\sin^2\theta| \le 4\rho^3. \]

Take $\delta(\varepsilon) = \sqrt[3]{\varepsilon/4}$. f(x, y) vanishes to at least the order of $\rho^3$.


4. As in the theory of functions of one real variable, sums and products 
of continuous functions and continuous functions of continuous 
functions are continuous. 


(a) Continuous. 
(b) Discontinuity possible only at (0, 0). Note with x = ρ cos θ, y =
ρ sin θ, from |sin α| ≤ |α|, that

\[ \left|\frac{\sin xy}{\sqrt{x^2 + y^2}}\right| \le \rho; \]

hence, the limit at (0, 0) exists and is 0.

5. Use the mean value theorem of the differential calculus to obtain for
z ≥ 0, z + h ≥ 0

\[ \left|\sqrt{1 + (z + h)} - \sqrt{1 + z}\right| = \frac{|h|}{2\sqrt{1 + z + \theta h}} \le \frac{|h|}{2}; \]

hence, it is sufficient with appropriate choice of z in each case to require
|h| < 2ε. Set Δx = ρ cos θ, Δy = ρ sin θ, where ρ < δ(ε, x, y).

(a) With $z = x^2 + 2y^2$ and h = Δz note that

\[ |\Delta z| = \rho\,|2x\cos\theta + 4y\sin\theta + \rho(\cos^2\theta + 2\sin^2\theta)|
   \le \rho\,(2|x| + 4|y| + 3\rho) < \rho\,(2|x| + 4|y| + 3), \]

where we impose δ ≤ 1. For |Δz| < 2ε, it is sufficient to require

\[ \delta \le \min\left(\frac{2\varepsilon}{2|x| + 4|y| + 3},\ 1\right). \]

. On the lines y = + x. 
. On the lines x =n+4,y=n+4+3. 
8. For all values. (By definition, a function is continuous in the exterior 

of its domain.) 
9. Set z = 1/u where u = 1 — x? — y?, |Az|=|Au|/(u + 0 Au). For u > 0, 

choose|Au|< 2/u. Then u + 0 Au > u/2 and 

4|Au| 


[az] < — 


~q O 


Now, with Ax = p cos 9, Ay = p sin 9,ọ <8 < land |x], |y|< 1, 
|au| = |p (2x cos 0 + 2y sin 0) + ẹ?| 
<p (2|x|+ 2|y|+ 1) < 58. 
Therefore, to enforce |z|< «, take 
5 = min E (1 —x? — y2)s 1}. 
11. With x = ρ cos θ, y = ρ sin θ, we have

\[ P = \rho^2\,(a\cos^2\theta + 2b\cos\theta\sin\theta + c\sin^2\theta) = \rho^2 f(\theta). \]

The expression f(θ) must not vanish for any value of θ. Thus we must
have $ac - b^2 > 0$.
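The link between the sign of ac − b² and the zeros of f(θ) can be checked by sampling. A rough sketch with hypothetical names follows; it detects zeros only through sign changes on a grid, so the borderline case ac − b² = 0 (a double zero without sign change) is deliberately not covered.

```python
import math

def f(theta: float, a: float, b: float, c: float) -> float:
    """Angular factor of the quadratic form a x^2 + 2 b x y + c y^2 in polar coordinates."""
    return (a * math.cos(theta) ** 2
            + 2 * b * math.cos(theta) * math.sin(theta)
            + c * math.sin(theta) ** 2)

def vanishes_somewhere(a: float, b: float, c: float, steps: int = 3600) -> bool:
    """True if the sampled values of f hit zero or change sign on [0, 2*pi)."""
    values = [f(2 * math.pi * k / steps, a, b, c) for k in range(steps)]
    return min(values) <= 0.0 <= max(values)

if __name__ == "__main__":
    for a, b, c in ((1, 0, 1), (1, 2, 1), (2, -3, 1)):
        print((a, b, c), "ac - b^2 =", a * c - b * b, vanishes_somewhere(a, b, c))
```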


12. All discontinuous, (a) on line x = 0, (c) on line y = −x.

13. For the approach along a straight line set x =p cos 0, y = pọ sin 0 
with 9 fixed. To show discontinuity for f(x, y), approach along the 
parabola, x = ay? with arbitrary a, for g(x, y), along the circle (x --- 4)? 

+ y? = 4. 

14, For (e) and (g) limits exist. For (h), set y = e~@/'*! with arbitrary 

positive « and show for 


f(x, y) = lel Vx? ty? 


2 ope + | 
Vx +y? + B 
that lim f(x, e7a'!z1) =e7a, 
x-0 
15. For Exercise 14(e), 
—_—___ il 
3 (©) = V2 loge 
For Exercise 14(g), 
— min (— 1982 1 
ò = min | log e’ z|- 


16. First set x = y = 0, then set z = 0. 


17. 


18. 


19. 


20. 
21. 


23. 
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Follows since R(x, y) is not defined at the origin and the origin is a 
boundary point of the domain of R. 
(a) 1 
(b) 0 
(c) 0. 
Set y = mx. Then lim z=3(1-—m)/1+™m). 
r= 


Compare Exercise 13. 

Approach along straight lines other than x = 0 yields the limiting value 
0. Approach along the curve y = a/log x yields the arbitrary limiting 
value a. 

¢ maps the part of its domain within any circle of sufficiently small 
radius pọ about the origin into an interval of radius Cp centered at 0, 
where the constant C may be fixed independently of p. 


Problems 1.3 (p. 26) 


1. 


Let S be the domain of f, S* the domain of f*. If Q is an interior point of
S, then there exists a neighborhood of Q entirely within S and continuity
for f* is identical with continuity for f. If Q in S* is a boundary point of
S, then whether or not Q is in S, there exists a δ-neighborhood of Q
wherein |f(P) − f*(Q)| < ε/2 for all points P of S. For any point Q′ of S* in the
δ-neighborhood of Q but not in S, there are points P in S within that
neighborhood for which f(P) is arbitrarily close to f*(Q′), say
|f(P) − f*(Q′)| < ε/2. It follows that

\[ |f^*(Q') - f^*(Q)| < \varepsilon. \]
2. If $\lim_{(x,y)\to(\xi,\eta)} f(x, y) = L$ and $\lim_{n\to\infty} (x_n, y_n) = (\xi, \eta)$, then for any positive
ε there is a δ such that |f(x, y) − L| < ε whenever (x, y) lies within the
δ-neighborhood of (ξ, η). Furthermore, there is an N such that $(x_n, y_n)$
lies within the δ-neighborhood of (ξ, η) for n > N. For n > N, then,
$|f(x_n, y_n) - L| < \varepsilon$.

Conversely, suppose for every sequence of points $(x_n, y_n)$ in the
domain of f with limit (ξ, η), we have $\lim_{n\to\infty} f(x_n, y_n) = L$. If f did not have
the limit L at (ξ, η), then for some ε > 0 and for all δ > 0, there exists
a point (x, y) ≠ (ξ, η) in the δ-neighborhood of (ξ, η) for which |f(x, y) − L|
≥ ε. Set $\delta_1 = 1$ and choose $(x_1, y_1)$ in the $\delta_1$-neighborhood of (ξ, η) so
that $|f(x_1, y_1) - L| \ge \varepsilon$. Define $\delta_n$ and $(x_n, y_n)$ sequentially by

\[ \delta_n = \tfrac{1}{2}\sqrt{(x_{n-1} - \xi)^2 + (y_{n-1} - \eta)^2} \quad\text{and}\quad
   \sqrt{(x_n - \xi)^2 + (y_n - \eta)^2} < \delta_n \]

with $|f(x_n, y_n) - L| \ge \varepsilon$. In this way, a sequence $(x_n, y_n)$ is constructed that violates
the hypotheses if f does not have the limit L at (ξ, η).


Exercises 1.4a (p. 30) 


l. 


(a) 2 = nax" t; A = mbym-1, 
(c) s, 2x? — 3y?, Oz _ 3y? — 2x? 
l ax xy ° ay xy? 
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Oz _ 3/2. dz_ 3 2 ay1/2 
(e) gy T Y ; ay 2 . 


ðz y8!4 ðz _ 83x12 


(8) ax 2x12?’ ay  Ayyll4® 
G) % = —2x sin (x? +); 9% = — sin (x? + 3). 
Ox > Oy 
dz =—s sin x, 02 cos x cos y 
q Fa SBE Z SOS EY 
Ox siny oy sin*y 
Oz 2x? 2 «02 x 
m 2% 22+, dz y 


ax Vey ay Ve toe 


2. (a) Of_ 2x _ . Of WY 
dx B(x? + y2)23’ ðy 8x? + y2)23 


9 


Of L az-y. f — erv 
(c) ax S dy e 


(e) ef = yz COS XZ; 5 = sin xz; pi = xy COS XZ. 
of af af arf 3f 
‘ Ly, L= — z= — z= 0; =], 


_ x+y, 
(c) Use f(x, y) = T— xy’ 


of i+y? , of _ 14+ 


3x (1—xy)?’ dy (1— xy)? 
af Ay +y. af _2(x+y), Af _ Ux + x’) 


— 


ax? (1— xy) ðxðy (1— xy) dy? (1— xy)? 
of 


(e) ot = yx) e(2%); ay = xv el”) log x. 


2 
of yxy- ea” (y — 1 + yx"); 


Ax? 
92 
— L = xv-1 ef) (1 + y log x + yx” log x); 
ax Oy 
f — xv (log x)? e” (1 + x”). 
dy? 
4. fz = 0, fy = 0, fz = —3. 
5. 1. 
8. (2/r). 


9. a= —3. 
Problems 1.4a (p. 31)


1. (” H ‘). (Compare Exercises 1.2, number 3.) 


2. Consider a function of the form f(x, y) = «(x)6(y) where « is differen- 
tiable and 6 is not. 


3. Differentiate with respect to x and y to obtain for all x and y, 
; y(x) y’(y) 
2 2) — YY) — Vy) . 
G(x? + y) = a YO) By p(x); 
whence, ’(x)/2x)(x) is constant. f(x, y) = ces? + 9”), 
Exercises 1.4c (p. 36) 


2. (a) Observe that the first partial derivatives, 


2x 
af _ lza aa €P [~ 1 + y], xy #0 
Ae (x? + y?) 
’ x=y=0 
2y 2 2 
Of la y XP [—1/(? + y*)], x,» #0 


0, x=y=0, 
are bounded. 
(b) The origin is the only point in question. Consider 


x4 + y4 
af _ Pst log (x? + y”), x,y #0 
x 


ð 
x=y=0Q, 


in the neighborhood x? + y2? < 82. Then 
Of — 933 4. 832 
ax < 283 + 8828 log è| 


< 1082, 
for è < 1, where we have used|è log è| < 1, for è < 1. 


Exercises 1.4d (p. 39) 


1. (a) 2ab 
(c) ab f”(ax + by) 


1 
© -Fy 


2. (b) fz = y sinh xy, fy = x sinh xy, frs = y? cosh xy, 


fzy = xy cosh xy + sinh xy, fyy = x? cosh xy, 
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fzzz = y? sinh xy, ferry = xy? sinh xy + 2y cosh xy, 
zyy = x*y sinh xy + 2x cosh xy, fyyy = x? sinh xy. 
(d) fz = 1/y — y/x*, fy =1fx — xy, fea = 2y/x?, 
fay = (— 1/x?) — 1/y?, fuy = 2x/y?, fare = — 6ylxt, frry = 2/23, 
fayy = 2/y*,  fuvy = — 6x/y*. 


Problems 1.4d (p. 39) 


1. (b) Set z = log u. Then zzy = 0. Thus zz does not depend on y. Set 
Zz = a(x); then, 


z = falx) dx + Hy) = $x) + Y0); 
whence, 


u = e? = epl?) ey), 


Exercises 1.5a (p. 42) 


1. (a), (b) $f_x(0, 0)$ does not exist.

(c) Set h = ρ cos θ, k = ρ sin θ. For differentiability it would be
necessary that

\[ f(h, k) - f(0, 0) = \rho\sin 2\theta = f_x(0, 0)h + f_y(0, 0)k + o(\rho), \]

but $f_x(0, 0) = f_y(0, 0) = 0$, a contradiction.
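The contradiction in (c) can be made concrete. The sketch below assumes the function f(x, y) = 2xy/√(x² + y²) with f(0, 0) = 0, chosen here only because it reproduces the value ρ sin 2θ used above; the exercise statement itself is not part of this section, so this choice is an assumption.

```python
import math

def f(x: float, y: float) -> float:
    """Assumed example: 2xy / sqrt(x^2 + y^2), extended by f(0, 0) = 0."""
    r = math.hypot(x, y)
    return 0.0 if r == 0.0 else 2.0 * x * y / r

def partial_x_at_origin(h: float) -> float:
    """Difference quotient for f_x(0, 0); f vanishes on both axes."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h

def remainder_ratio(rho: float, theta: float) -> float:
    """(f(h, k) - f(0,0) - 0*h - 0*k) / rho, which would have to tend to 0."""
    h, k = rho * math.cos(theta), rho * math.sin(theta)
    return (f(h, k) - f(0.0, 0.0)) / rho

if __name__ == "__main__":
    print(partial_x_at_origin(1e-6))
    for rho in (1e-2, 1e-5, 1e-8):
        print(rho, remainder_ratio(rho, math.pi / 4))  # stays near sin(pi/2) = 1
```

The ratio equals sin 2θ for every ρ, so it cannot tend to zero along the direction θ = π/4.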


2. For s between x and x +81, t between y and y + 82, we have | g(s) — g(x)| 
< e1(81), | A(t) — h(y)|< e2(82) where lim €1(81) = lim e2(S2) = 0. Con- 
517 $27 


sequently, by the mean value theorem of integral calculus, 
+8 zx 
i 1 g(s) ds = f no © ds + dig(&) 


where |g(&) — g(x)|<«1(81); a similar result holds for h(t). It follows 
that 


fx +81, y + 8) =| JP gC) ds + g(a) + 081), 
+ | fA) dt + Bah(y) + o(82)| 


= f(x. y) + 81g(x) + d2h(y) + 0(V812 + 822). 


Problems 1.5a (p. 43) 


1. Set $\rho = \sqrt{h^2 + k^2}$. Then

\[ |f(x, y) - f(a, b)| \le \rho\,\bigl(|f_x(a, b)| + |f_y(a, b)| + \varepsilon\bigr), \]

where $\lim_{\rho\to 0} \varepsilon = 0$. Thus, f is not only continuous, but Lipschitz con-
tinuous: for P = (x, y), A = (a, b), we have in some neighborhood of A,

\[ |f(P) - f(A)| \le M\,|P - A|, \quad\text{where M is constant.} \]


Exercises 1.5b (p. 45) 


l. 


av3 +b a+ 
. (a) a, 9 , 


The slope of the section of the surface z = f (x, y) with the plane 
arc tan[(y — yo)/(x — xo)] = «; that is, the slope in the z, p-plane of the 


curve z = ¢(p) = f(x + pẹ cos a, y +e sin a). 
by 3 
5 , b. 


(c) 2, V3 — 2, 1 — 2/73, — 4, 


9? 
(g) 0, 0, 0, 0. 

. (a) — 8/5 
(b) —1 
(e) — 2/73. 


4. f (x, y) = xy/(x? + y?). 
6. 02f/dr2 = sin 20. 


Exercises 1.5c (p. 48) 


1. 


(a) z= 8y—4 
(c) 3x + 3y — 4z + 5 — 3 logg 2 = 0 
(e) z=[exp(1l/V2)/V21@—y+vV2 +74) 


(g) z = 2e-2(x + y + 3 e? f e- dt — 2). 


. The common point is the origin. 
. The equation of the plane through the three points can be put in the form 


zZz — 20 = 


(x — xo) [ki(z2 — 20) — kz(zı — zo)] + (y — yo] [h2(zı — zo) —hi(ze — 20)] 


heki — hike 


9 


where hi = xi — xo, ki = y — yo, for i = 1,2. Set hi = picosau, ki = pi sin ai. 
Then zi —zo = pil(cos «:)(02/0x) + (sin «)(0z/0y)] + o(p:). Enter this in 
the equation of the plane with sin («1 — «2) + 0, and(x, y) fixed to ob- 


tain the desired result, 


E dz 
z — Zo = (x — xo) Jx + (y — yo) ay 


dz 4 olen) 5 ol) 
4. We may suppose not all coefficients vanish, say c ≠ 0. Then $(x_0, y_0, z_0)$
lies on one of the surfaces

\[ z = \pm\sqrt{\frac{1 - ax^2 - by^2}{c}}. \]

The tangent plane has the equation

\[ z - z_0 = (x - x_0)\,z_x(x_0, y_0) + (y - y_0)\,z_y(x_0, y_0). \]

Differentiate the equation for the quadric surface to obtain

\[ 2ax_0 + 2cz_0\,\frac{\partial z}{\partial x} = 0, \qquad
   2by_0 + 2cz_0\,\frac{\partial z}{\partial y} = 0, \]

and insert the values for ∂z/∂x and ∂z/∂y in the equation for the tangent
plane to obtain (if $z_0 \ne 0$),

\[ z - z_0 = -\frac{ax_0}{cz_0}(x - x_0) - \frac{by_0}{cz_0}(y - y_0), \]

whence

\[ ax_0x + by_0y + cz_0z = ax_0^2 + by_0^2 + cz_0^2 = 1. \]
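The tangent-plane formula for the quadric ax² + by² + cz² = 1 can be compared with finite differences. This is a minimal sketch on the upper sheet z > 0, assuming sample coefficients for which the square root is real; all function names are hypothetical.

```python
import math

def z_upper(x, y, a, b, c):
    """Upper sheet z = +sqrt((1 - a x^2 - b y^2) / c) of the quadric."""
    return math.sqrt((1.0 - a * x * x - b * y * y) / c)

def slopes_from_plane(x0, y0, a, b, c):
    """Slopes of the plane a x0 x + b y0 y + c z0 z = 1 solved for z."""
    z0 = z_upper(x0, y0, a, b, c)
    return (-a * x0 / (c * z0), -b * y0 / (c * z0))

def slopes_numeric(x0, y0, a, b, c, h=1e-6):
    """Central-difference approximations of z_x and z_y at (x0, y0)."""
    zx = (z_upper(x0 + h, y0, a, b, c) - z_upper(x0 - h, y0, a, b, c)) / (2 * h)
    zy = (z_upper(x0, y0 + h, a, b, c) - z_upper(x0, y0 - h, a, b, c)) / (2 * h)
    return (zx, zy)

def tangent_plane_z(x, y, x0, y0, a, b, c):
    """Height of the tangent plane at (x0, y0, z0), evaluated at (x, y)."""
    z0 = z_upper(x0, y0, a, b, c)
    return (1.0 - a * x0 * x - b * y0 * y) / (c * z0)

if __name__ == "__main__":
    print(slopes_from_plane(0.3, 0.2, 1.0, 2.0, 3.0))
    print(slopes_numeric(0.3, 0.2, 1.0, 2.0, 3.0))
```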


Exercises 1.5d (p. 51) 


1. (a) (2xy? + 3y3) dx + (2x2y + Oxy? — By?) dy. 
(c) 4x3 dx — 3y? dy/(x* — y’). 
(e) —(dx + y7! dy) sin (x + log y). 
(g) dx + dy/(1 + (x + y)?). 
(i) (dx + dy — dz) sinh (x + y — 2). 
2. (—2/10) + (7 7/5/25) 


3. ex2+ul(Qx3 + 12x) dx? + (8x2y + 4y) dx? dy + (8xy? + 4x) dx dy? 
+ (8y3 + 12y) dy?]. 


Exercises 1.5e (p. 53) 


1. z varies from —3 to — 3.5. 


1 
2. — 600° 
3. 1/2 (y|h| + x|k|). 
4. From dz = y dx + x dy, dz/z = dx/x + dy/y.
5. From $dg = 2\,dx/t^2 - 4x\,dt/t^3$, the relative error in g is dg/g = dx/x − 2 dt/t.
Thus a given relative error in the measurement of t will have twice
the effect of the same relative error in the measurement of x.


Exercises 1.6a (p. 57) 


1. (a) 


(e) 


2. (a) 


(b) 


(c) 


(d) 


2 
Zxr = —?2x log (1 + y), Zy = -I Fy 222 = — 2 log (1 + y), 
ye at yy Oy) 


Set u = x, v = arc tan y, Zz = v sec?(uv), zy = [sec?(uv)]/(1 + y?), 
Zza = 2v2 sec?(uv) tan (uv), Zzy = [sec2(uv)/(1 + y»? [1 + 2v tan (uv)], 
Zyy = x sec®(uv)/(1 + y?)? [x tan(uv) — 2y]. 
w, = — x — y cos Z 
(x2 + y2 + 2xy cos z)3/2’ 


w, = — y — x COS Z 
Y (x2 + y2 + 2xy cos z)???’ 


w, = xy sin Z 
Z (x2 + y2 + 2xy cos z)?’2 ` 
v= 1 
t D DTN 
V22 + 22y? + yt — x?’ 
Wy = —— 
4 (z + YWZ + Qzy? + yt — x2’ 
w, = ——————— 
2 (z + YW + 2zy? + yt — x2 ` 
Wr = 2x + 2xy 


1+ x2 + y2 4 22’ 
2y? 


Wy = l 1 2 2 2 I 


w, = 2yz 
1+ x? + y? az 
_ 1 
Wr = — OOOO 
2(1 + x + yzWx + yz 
z 
Wy = 4, 
“A1 + «+ yzWyx + yz 
Wz 


=— J 
2(1 + x + yz\Wx + yz 


3. (a) Consider the derivative of $z = u^v$ where u and v are functions of x:

\[ \frac{dz}{dx} = v\,u^{v-1}\,\frac{du}{dx} + u^v \log u\,\frac{dv}{dx}. \]

Employ this formula for u = x, v = x to obtain

\[ \frac{d}{dx}\,(x^x) = x^x(1 + \log x). \]

Now employ the formula again for u = x, $v = x^x$ to obtain

\[ \frac{d}{dx}\,\bigl(x^{x^x}\bigr) = x^{x^x}\,x^x\left[\frac{1}{x} + \log x + (\log x)^2\right]. \]

(b) Set y = 1/x. Then

\[ \frac{dz}{dx} = -\frac{1}{x^2}\,\frac{dz}{dy}. \]

Use $z = (y^y)^y = u^v$, where u = y, $v = y^2$ to obtain

\[ \frac{dz}{dy} = y^{y^2+1}(1 + 2\log y) = yz\,(1 + 2\log y), \]

whence,

\[ \frac{dz}{dx} = \frac{2\log x - 1}{x^{3 + 1/x^2}}. \]
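The derivative of the power tower in part (a) can be compared with a central difference. A small sketch with hypothetical function names, valid for x > 0:

```python
import math

def tower(x: float) -> float:
    """The function x^(x^x) for x > 0."""
    return x ** (x ** x)

def tower_derivative(x: float) -> float:
    """Closed form x^(x^x) * x^x * (1/x + log x + (log x)^2) from the rule for u^v."""
    log_x = math.log(x)
    return tower(x) * x ** x * (1.0 / x + log_x + log_x ** 2)

def central_difference(fn, x: float, h: float = 1e-6) -> float:
    """Symmetric difference quotient approximating fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

if __name__ == "__main__":
    for x in (0.5, 1.0, 1.5):
        print(x, tower_derivative(x), central_difference(tower, x))
```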


4. See Problem 1. 
5. Use the symmetry in the several variables and calculate in each case: 


(a) fer = 2 * 


2x2 — y2 — 2? 
(b) Erz = ~ (x? 4 + y2)? 9 

_ 6x? — 2y? — 222 — 2w? 
O tae = 2+ y 


Problems 1.6a (p. 58) 


1. Use the Cauchy-Riemann equations in 
rz + yy = (uz? + Uy) fun + 2(UzVz + UyVy)fuv + (vz? + Vy? fov 
+ (Uzz + Uyy)fu + (Vez + Vyy)fo, 


and note that u and v are also solutions of Laplace’s equation. 


. Let the vertex of the cone be located at the origin (no loss of generality 
is entailed since a translation of axes will not affect the derivatives of 
f). If a point (x, y, z) lies on the cone, then so also does the point (Ax, Ay, 
Az) where A is any real number. We therefore have 

Z_f(* Y\ yy) — g([Y\. 

z = F(Z, 2) =F(1,2) = 9 |; 
thus the equation of the cone can be written in terms of a function ¢ 
of one real variable: 
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y 
z= Z]. 


The result follows on differentiation. 
3. (a) $g_{rr} + \frac{2}{r}\,g_r$.

(b) From $g_{rr}/g_r = -2/r$, obtain $\log g_r = -2\log r$ + constant, etc.

4. (a) $g_{rr} + \frac{n-1}{r}\,g_r$.

(b) If n = 1, ar + b.
If n = 2, a log r + b.
If n > 2, $a/r^{n-2} + b$ (compare Problem 3).


Exercises 1.6c (p. 63) 


l. Vur? + (1/r?) ue?. 

2. Set u = f(x, y) and introduce new variables by § = x cos 9+ y sin 9, 
7 = ycos 8 — xsin@. Obtain uzz = cos? 9 ugg — 2 cos Osinð wen + sin?ð Unn, 
Uyy = sin?0 wee + 2 cos 0 sin 9 ugn + cos? 8 Unn. 

4. Zz = 8, Zy = l, Zr = Zz cos 9+ zysin 9, ze = — zzrsin 9 + zyrcos 9. 

5. Note that the derivatives do not depend on a and b. The transformation 
is essentially a rotation and translation of the x, y-axes. Compare 
Exercise 2 and 3. Use 


Ure = «a Uee — 2aBUEn + B Unn, 
Uszy = «BUee + (a? — B?) Ven — «BUnn, 
Uyy = P Uzt + 2aBUEn + «7? Un. 

For a geometrical interpretation see 1.6 a, Problem 2. 


3 2 
6. ~~ T: + Tan + Ê Tze + Č Tee. 
2x2 x x2 


Problems 1.6c (p. 64) 


19 (ou), 1 du, a (dw 
‘ror ( an + sin 0 0¢2 + du (sine | l 
To compare with 1.6 a, Problem 3, let derivatives of u with respect to 0 
and ¢ vanish. 
2. Under the given transformation, the equation $Af_{xx} + 2Bf_{xy} + Cf_{yy} = 0$
is transformed into $A^*f_{\xi\xi} + 2B^*f_{\xi\eta} + C^*f_{\eta\eta} = 0$, where

\[ A^* = a^2A + 2abB + b^2C, \]
\[ B^* = acA + (ad + bc)B + bdC, \]
\[ C^* = c^2A + 2cdB + d^2C \]

(compare Exercise 3). Observe that

\[ B^{*2} - A^*C^* = (ad - bc)^2\,(B^2 - AC). \]

Thus, the sign of $B^{*2} - A^*C^*$ is independent of the linear transfor-
mation. It follows that no such transformation exists for (a) if
$B^2 - AC \ge 0$ or for (b) if $B^2 - AC \le 0$.


(a) Assume B? — AC <0, and set A*=1, B*=0, C*=1 above. 
Observe from AC > B? > 0 that A and C have the same nonzero 
sign, which we may assume to be positive. If B = 0, take b = c = 0, 
a= 1//A, d = 1/VC. If B #0, first reduce to the case B= 0, 
for example, by taking 

l c= _ B d= _ -A 

VA’ — VA(AC — B?)’ ~ VA(AC — BJ ` 

(b) Assume B? — AC > 0 and set A* = C* = 0, B* = 1 above. If B= 
0, then A and C have opposite signs. In that case, satisfy the equations 


geV-$, =J, maen 


b=0, a= 


for example, take 
_ _/|_A _l i C 
a=l, b=/—4, c= =9 ZA: 


If B + 0 and at least one of A or C is nonvanishing, say A > 0, first 
reduce to the case B = 0, for example, by taking A* = A, C* = 
—1/A, b = 0, then 


1 B 


a=1, d=- AC CTT /A(B? — AC)’ 
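The invariance relation B*² − A*C* = (ad − bc)²(B² − AC) used in this solution can be checked with exact integer arithmetic. The sketch below only evaluates the three transformation formulas for A*, B*, C*; the function names and sample values are hypothetical.

```python
def transformed_coefficients(A, B, C, a, b, c, d):
    """Coefficients A*, B*, C* after the linear change of variables with matrix (a b; c d)."""
    A_star = a * a * A + 2 * a * b * B + b * b * C
    B_star = a * c * A + (a * d + b * c) * B + b * d * C
    C_star = c * c * A + 2 * c * d * B + d * d * C
    return A_star, B_star, C_star

def discriminant(A, B, C):
    """The quantity B^2 - AC whose sign classifies the equation."""
    return B * B - A * C

if __name__ == "__main__":
    A, B, C = 1, 2, 3
    a, b, c, d = 2, 1, -1, 4
    stars = transformed_coefficients(A, B, C, a, b, c, d)
    print(stars, discriminant(*stars), (a * d - b * c) ** 2 * discriminant(A, B, C))
```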


Exercises 1.7a (p. 66) 


1. (a) (h + k) cos (x + h + y + k). 


_hgy +k) k 
©) (Get he rth 


2. (a) -5 


(b) 3 e516, 


(c) 3. 


Exercises 1.7b (p. 68) 


1. For a curve defined by the intersection with the surface z = f(x, y) of a 
vertical plane h(n — y) — R(E — x) = 0 through the point (x, y), there exists 
a tangent at some interior point of any arc that is parallel to the chord 
joining the end points. 


2. (a) . 

(b) > arc sins 2 v2 . 
3. Take x = 0,y = — 5, h=k=}. 
5. (a) 3 

(b) = 


Problems 1.7b (p. 68) 


1. It is sufficient to prove that f has the same value for any two points 
that can be connected by a segment within the domain. 


Exercises 1.7c (p. 70) 


1. xy. 
2. Observe that df vanishes at (2, 3) for h = 0.1, k = — 0.1. Thus, approxi- 
mately, f(2.1, 2.9) = f (2, 3) + 4d?f(2, 3) = 79.9. 
3. The approximation is exact. The error is zero to all orders. 
4. (a) x? — 2x?y + y? + h(Bx? — 4xy) + R(Qy — 2x?) + h2(8x — 2y) — hk4x 
+ k? + 6h? — 2h2k. 


(— 1)"(h + 2k)?” 
(o) È, Qn—-1)! ' 


(c) The cases x + h>0, x + h <0 must be taken separately; the two 
cases yield different first order terms in A: 


xty — 2y2x — /3|x|+h(4x3y — 2y2 — V3 sgn(x + h) 
+ k(x4 — 4yx) + h?6x2y + hk4x? — k?2x + h?4xy 
+ k?k6x? — 2hk? + hty + 4hëk + h4k. 
5. x + x(y — 1) — 2x(2 + 1) — 2x(y — 1) (z + 1) + 2x(z + 1} 
+ x(y —1) (z +1}. 


3 5 
6. (a) Rig teas we 


x x n zy yp 
(b) y+ = 47 e+ ee + t 490° 
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(c) 1y hE 4S 
@1t2e+e-¥ 42D 4 
(f) my tO EM Dye. 


4 4 
@ 1+ —y +S —xyt totes 


O +y- FX _ FY _Y 


7. Observe that the error is fourth order. To fourth order 


2 — y2 4 242 4 
cosx q _ 38y? xt bxy +O, 
cos y 2 24 . 
for the fourth-order term we have 
xt — 6x2y2 + By4 _ (y2 — x2) (By? — x?) 
24 24 ` 


For |x| < 7/6, |y|< 7/6 the two factors reach their maxima at x = 0, 
y = 7/6. Thus, we estimate the error as about 


Problems 1.7c (p. 70) 
oo n fs) © 
l(a) DD [æy = 2 (” + "my 
n=0 r=0 \T n=0 m=0 n 
converges in the strip |x + y|< 1. 


oo n xT nr oo oo xm n 
b) È = 7 7 2 ; 
n=0 r=1 T! (n— r)!  n=0 m=om!n! 


converges for all values of x and y. 
2. Expand both sides of the spherical formula to second order in x, y, and z. 
3. Expand f (2h, e-!/2") and f(0, 0) to second order in the neighborhood of 
(h, e-1/*); add and divide by h?. 
4, Convergence follows by convergence of the expansion of the exponential 
function for one variable. Differentiate with respect to x to obtain 




2Hn-1(x)y" 
1 (n—1)! 
whence (b) follows on equating coefficients. From (b) and Ho(x) = 1, (a) 
follows inductively. To obtain (c), differentiate with respect to y and 
equate coefficients. To obtain (d), use (b) to replace 2nHn-: in (c) by Hn’ 
and then differentiate to obtain 
Ani — 2xHrn’ + 2H,’ + Hr” = 0. 


Next use (b) in this result to replace Hn+1’ by 2(n + 1) Hn. 


Ms 


oo H. / n 
2yf (x, y) = pe = 


Exercises 1.8b (p. 80) 


1. Use the uniform continuity of Bx(x, k) for x in the closed interval 
a < x < band k restricted to any closed subinterval of ko < k < kı. 


2. (a) For s = k-? and 1 — e < x < 1, we have for large k 
k log x = k(x — 1) + 0(k713) 


x— 1 
log x 


= 1 + 0(k-?/), 


hence 


x(x — 1) _ se- -1/3 
ED L eked (1 + 00-19), 


while forO<x<l-—e 
x(x — 1)_ 0 (F — 1 ene). 
log x 
It follows that 


FR) =f. +f. = i t+ 04%), 


0 
(b) By Ex. 1, 


1 


k) = x(x — —~ it __ 1 
F’(k) Jake 1) dx L42 kF 


Hence F(k) = log = H r + c, where the value of the constant c turns 


out to be 0 from (a). 
Exercises 1.9b (p. 92) 
2x . . 
1. (a) J, (—t sin t + cos? t + sin t) dt = 3r 


(b) f? (2x0 — 2txoyo(1 — t?) + yo(1 — t?)) dt = — s(x — yo). 
Exercises 2.1 (p. 141)


1. If X = (x, y, z) is an arbitrary point of the line, then 
— te 
PX =A, 
where A may be any real number. Thus, 
(x + 2, ¥,2— 4) = A(2, 1, 3), 


or 


bo 

| 

<< 
oo 


2. Set PQ = A. Any point X of the line satisfies PX = A. Let B, C, and 
V be the position vectors of P, Q, and X, respectively. Then, 
PX =V—B=)A=XC—B); 
or 
V=(1—AB+ aC 
In particular, if P = (3, — 2, 2) and Q = (6, — 5, 4), as given in (a), 
(x, y, z) = AB, — 3, 2), 
or 


XY% 


3 3 2 


3. If V is the position vector of any point X on the line joining P to Q,
then, by the solution to Exercise 2,

\[ V = (1 - \lambda)A + \lambda B \]

for some real λ. Thus,

\[ (1 - \lambda)(V - A) = \lambda(B - V) = (1 - \lambda)\lambda\,(B - A). \]

If 0 < λ < 1, it follows that V − A, B − V and B − A have the same
direction and |V − A|/|B − V| = λ/(1 − λ).

4. Write the position vector in the form

\[ V = A + \lambda(B - A), \]

where B − A is represented by $\overrightarrow{PQ}$, to see that λ > 0.

5. Let A, B, C, D, E be the position vectors of the points P, Q, R, S, M,
respectively. Take the origin O at the point dividing MS in the ratio
1/3. Thus, D = −3E. Since E = ⅓(A + B + C), it follows that

\[ \tfrac{1}{4}(A + B + C + D) = 0. \]

Hence, O is the center of mass by the general definition and clearly
does not depend on the order of the vertices.
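The statement of Exercise 5 can be checked on a concrete tetrahedron. The following sketch uses hypothetical helper names and assumes M is the centroid of the face PQR, as in E = ⅓(A + B + C).

```python
def mean_point(points):
    """Arithmetic mean of a list of 3-dimensional points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def divide_segment(m, s, fraction):
    """Point at the given fraction of the way from m to s."""
    return tuple(mi + fraction * (si - mi) for mi, si in zip(m, s))

if __name__ == "__main__":
    P, Q, R, S = (0, 0, 0), (4, 0, 0), (0, 4, 0), (0, 0, 4)
    M = mean_point([P, Q, R])            # centroid of the face PQR
    O = divide_segment(M, S, 0.25)       # divides MS in the ratio 1 : 3
    print(O, mean_point([P, Q, R, S]))   # both should be the center of mass
```

Relabeling the vertices changes M and S but not the resulting point O, in agreement with the last sentence of the solution.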


6. Let the edges be PQ and RS; in the notation of the preceding solution
their midpoints have position vectors ½(A + B) and ½(C + D), respec-
tively. From the solution to Exercise 5, ½(A + B) = −½(C + D);
hence, the midpoints are collinear with the center of mass O and equi-
distant from it.


. If Pk = (xr, Yr, Zk), for k = 1,2,...,n, then 


Emre EMkYk ne | 


G = (xo, yo, Zo) = | =m, ’ Emy ’ =m 


Emr Ar = (ZMi(X~ — xo), UME ye — yo), UM«(Ze — Zo)) = (0, 0, 0). 


8. The zero vector is the real number 1. "Multiplication" of the "vector"
a by the scalar λ means raising a to the power λ. Thus, if vector "addition"
is denoted by ⊕, scalar multiplication by ⊙,

\[ \lambda \odot (a \oplus b) = (ab)^\lambda = a^\lambda b^\lambda = (\lambda \odot a) \oplus (\lambda \odot b). \]

9. The complex number a + ib corresponds to the vector (a, b).

10. Take the origin as center of the sphere and let A, B, R be the position
vectors of P, Q, R, respectively. If the radius of the sphere is ρ,

\[ |A|^2 = |B|^2 = |R|^2 = \rho^2 \]

and B = −A. Consequently, from (15c)

\[ (R - A)\cdot(R - B) = (R - A)\cdot(R + A) = |R|^2 - |A|^2 = 0. \]
11. (a) From (X − P) · A = 0, an equation of the plane is

\[ x + 2y - 2z = -1. \]

With the unit normal B = (−1/3, −2/3, 2/3), obtain the normal form

\[ -\tfrac{1}{3}x - \tfrac{2}{3}y + \tfrac{2}{3}z = \tfrac{1}{3}. \]

(b) 2/3.

(c) Same.

12. (a) Set $P = (y_1, y_2, \ldots, y_n)$ and let B be the position vector of P. If
$Q = (x_1, x_2, \ldots, x_n)$ with position vector X is the foot of the per-
pendicular, then

\[ A \cdot X = c \quad\text{and}\quad B - X = \lambda A. \]

Thus A · (B − λA) = c, hence $\lambda = (A \cdot B - c)/|A|^2$ and

\[ X = B + A\,(c - A \cdot B)/|A|^2. \]

(b) (−1/9, 2/9, 2/9) and (7/9, −13/9, −5/9), respectively.

13. Observe first that C ≠ O; otherwise,

\[ A = \frac{A \cdot B}{|B|^2}\,B, \]

violating the condition that A and B are nonparallel. B · C = 0.
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14. 


The angle between the line and the plane is the complement of the angle 
between the line and the normal; that is, 
xA + BB +YC 


sme = To + B+ 2 VAP + BeF OE 


Exercises 2.2 (p. 158) 


1. 


(a) The line x = —1 + 4, y = 2,z2=14 32. 

(b) The plane x= 2 + 3u +v, y = 1 — 2u, z = —4 + u — v;or x+ 2y 
+z=0. 

(c) The two-dimensional linear space of points (x, y, z, w) satisfying 
x + 2y + z = 0 and 2y + 2z + w = —4. 


. (a) Ai = V2 Ei + 2Es. 
3. For $E_1$, only $E_1 = A_1/|A_1|$ is possible. Suppose such vectors up to index
k − 1 have been found. Take $E_k = V_k/|V_k|$ where

\[ V_k = A_k - \sum_{\mu=1}^{k-1} (A_k \cdot E_\mu)\,E_\mu. \]

Observe that if $E_\mu$ depends on $A_1, A_2, \ldots, A_\mu$, for μ = 1, 2, . . .,
k − 1, then $E_k$ depends on $A_1, A_2, \ldots, A_k$.
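The construction of the vectors E_k can be written out step by step. This is a minimal sketch of the orthogonalization described above, using plain lists; the function names and the dependence tolerance 1e-12 are assumptions made for the illustration.

```python
import math

def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))

def orthonormalize(vectors):
    """Build E_1, ..., E_n with V_k = A_k - sum_mu (A_k . E_mu) E_mu and E_k = V_k / |V_k|."""
    basis = []
    for a in vectors:
        v = [float(x) for x in a]
        for e in basis:
            coeff = dot(a, e)                      # component of A_k along E_mu
            v = [vi - coeff * ei for vi, ei in zip(v, e)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-12:
            raise ValueError("the given vectors are linearly dependent")
        basis.append([vi / norm for vi in v])
    return basis

if __name__ == "__main__":
    for e in orthonormalize([(1, 2, 3), (2, 3, 1), (3, 1, 2)]):
        print(e)
```

A vanishing V_k signals that A_k depends on the earlier vectors, which is how the same procedure can be used to test independence.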


. Let Ax, R= 1, 2,...,n+1 be any set of n+ 1 vectors. If Ai,..., 


An are dependent so is the full set of n + 1 vectors; if not, the vectors 
Ei,..., En are dependent on Ai,..., An by Exercise 3. Since Ex, 
k=1,2,...,nmay be taken as coordinate vectors, An+1 depends on 
E,..., En; hence, a fortiori, it depends on Ai, A2, ..., An. 


5. In the vector form the line has the equation

\[ Z = At + B \]

where B = (b, d, f) and A = (a, c, e). Let Q be the foot of the perpen-
dicular from P to the line and $X_0 = (x_0, y_0, z_0)$, $X_1 = (x_1, y_1, z_1)$ the
position vectors of P and Q, respectively. Since Q is on the line, for
some number τ, $X_1 = A\tau + B$. But, from $(X_1 - X_0) \cdot A = 0$ the de-
sired distance d is given by

\[ d^2 = |X_1 - X_0|^2 = (X_1 - X_0)\cdot(A\tau + B - X_0) = (X_1 - X_0)\cdot(B - X_0) \]
\[ \phantom{d^2} = (x_1 - x_0)(b - x_0) + (y_1 - y_0)(d - y_0) + (z_1 - z_0)(f - z_0), \]

where

\[ (x_1, y_1, z_1) = (a\tau + b,\ c\tau + d,\ e\tau + f) \]

and

\[ \tau = \frac{(X_0 - B)\cdot A}{|A|^2} = \frac{a(x_0 - b) + c(y_0 - d) + e(z_0 - f)}{a^2 + c^2 + e^2}. \]

6. No. To prove this, show that the coefficient vectors (1, 2, 3), (2, 3, 1),
(3, 1, 2) are linearly independent. For example, use the method of
solution of Exercise 3 to construct a set of three mutually perpendicular
vectors that depend on the coefficient vectors.


. This is equivalent to solving the system of linear equations in Exercise 


6 with constants a1, az, a3 instead of 0, 0, 0 on the right 


ae _1 _ 
xı = g (—5aı + a2 + Tas), x2 = T (a; + Taz — 5a3), 


_1i — 
x3 = ig (7a1 — 5a2 + a3). 


8. From the solution to Exercise 7,
\[ \frac{1}{18}\begin{pmatrix} -5 & 1 & 7 \\ 1 & 7 & -5 \\ 7 & -5 & 1 \end{pmatrix}. \]
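The inverse can be checked with exact rational arithmetic. The sketch below assumes that the coefficient matrix of Exercise 6 has the rows (1, 2, 3), (2, 3, 1), (3, 1, 2) quoted in that solution; the names `matmul` and `solve` are illustrative.

```python
from fractions import Fraction

a = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
inv = [[Fraction(n, 18) for n in row]
       for row in ([-5, 1, 7], [1, 7, -5], [7, -5, 1])]

def matmul(p, q):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def solve(rhs):
    """Solve a x = rhs by applying the inverse matrix."""
    return [sum(inv[i][j] * rhs[j] for j in range(3)) for i in range(3)]

product = matmul(a, inv)
identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
x = solve([18, 0, 0])
```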
9. If $a$ is singular, the column vectors $A_1, A_2, \ldots, A_n$ are dependent. If a solution $X = (x_1, x_2, \ldots, x_n)$ existed for every $Y$, then every $Y$ would have a representation
\[ Y = x_1A_1 + x_2A_2 + \cdots + x_nA_n, \]
but the $A_i$ do not span the space.


—2 3 4 —2 —4 1 
ab=| 1 0 1], ba=|—4 -2 1 
—4 3 2 3 3 0 


11. $\Delta = ad - bc \neq 0$.
\[ \frac{1}{\Delta}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \]

12. Suppose that $ae = ea = a$ and $ae' = e'a = a$ for all square matrices $a$. Then $e'e = ee' = e = e'$.

13. $b^{-1}a^{-1}$.

14. From our definition, a matrix is singular if and only if the column
vectors are dependent. Thus, at least one of the column vectors can be 
expressed as a linear combination of the others. It follows that any 
image vector in the mapping can be expressed as a linear combination 
of no more than n — 1 given vectors. Conversely, if the dimension of the 
image space is less than n, the column vectors of the matrix must be 
linearly dependent, for if they were independent, their linear combi- 
nations would span n-dimensional space. 


15. Express $X$ in the form $(r\cos\theta, r\sin\theta)$. Then, for
\[ a = \begin{pmatrix} \cos\gamma & -\sin\gamma \\ \sin\gamma & \cos\gamma \end{pmatrix}, \]
\[ aX = (r\cos(\theta + \gamma),\; r\sin(\theta + \gamma)); \]
hence, $a$ may be interpreted as a rotation of vectors through the angle $\gamma$ or a rotation of axes through the angle $-\gamma$. For
\[ b = \begin{pmatrix} \cos\gamma & \sin\gamma \\ \sin\gamma & -\cos\gamma \end{pmatrix}, \]
\[ bX = (r\cos(\gamma - \theta),\; r\sin(\gamma - \theta)); \]
a reflection of vectors in the line inclined at angle $\gamma/2$ with respect to the x-axis or a reversal of sense of the y-axis followed by a rotation of axes through the angle $-\gamma$.
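A quick numerical check of the two interpretations, written as a sketch with hypothetical helper names (`rotation`, `reflection`, `apply`). It assumes the rotation matrix with rows $(\cos\gamma, -\sin\gamma)$, $(\sin\gamma, \cos\gamma)$ and the reflection matrix with rows $(\cos\gamma, \sin\gamma)$, $(\sin\gamma, -\cos\gamma)$, and verifies that the first sends the polar angle $\theta$ to $\theta + \gamma$ while the second sends it to $\gamma - \theta$.

```python
import math

def rotation(gamma):
    """Matrix a: rotation of vectors through the angle gamma."""
    return [[math.cos(gamma), -math.sin(gamma)],
            [math.sin(gamma), math.cos(gamma)]]

def reflection(gamma):
    """Matrix b: reflection in the line through the origin at angle gamma/2."""
    return [[math.cos(gamma), math.sin(gamma)],
            [math.sin(gamma), -math.cos(gamma)]]

def apply(m, x):
    """Apply a 2x2 matrix to a 2-vector."""
    return [m[0][0] * x[0] + m[0][1] * x[1],
            m[1][0] * x[0] + m[1][1] * x[1]]

r, theta, gamma = 2.0, 0.3, 1.1
X = [r * math.cos(theta), r * math.sin(theta)]
rotated = apply(rotation(gamma), X)
reflected = apply(reflection(gamma), X)
# A reflection applied twice restores the original vector.
twice = apply(reflection(gamma), reflected)
```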


16. The condition is necessary for orthogonality by (49a). It is also sufficient, for if $A_k$ is the $k$th column vector of $a$, it is the $k$th row vector of $a^T$. By the definition of matrix multiplication, $aa^T = e$ implies
\[ A_j \cdot A_k = \begin{cases} 0, & \text{if } j \neq k \\ 1, & \text{if } j = k. \end{cases} \]

17. Set $c = ab$. If $c = (c_{ij})$, then $c^T = (c_{ij}^T)$, where
\[ c_{ij}^T = c_{ji} = \sum_{k=1}^{n} a_{jk}b_{ki} = \sum_{k=1}^{n} b_{ik}^T a_{kj}^T ; \]
that is, $c^T = b^Ta^T$.

18. From Exercises 13, 17, and 16, if $a$ and $b$ are orthogonal,
\[ (ab)^T = b^Ta^T = b^{-1}a^{-1} = (ab)^{-1}, \]
which is sufficient for the orthogonality of $ab$.
19. If $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$, then by (47),
\[ (aX) \cdot (aY) = (x_1A_1 + x_2A_2 + \cdots + x_nA_n) \cdot (y_1A_1 + y_2A_2 + \cdots + y_nA_n) = x_1y_1 + x_2y_2 + \cdots + x_ny_n. \]

20. A length-preserving matrix $a$ must also preserve scalar products; for
\[ |aX + aY|^2 = |aX|^2 + |aY|^2 + 2(aX) \cdot (aY) = |X|^2 + |Y|^2 + 2(aX) \cdot (aY) \]
and
\[ |aX + aY|^2 = |a(X + Y)|^2 = |X + Y|^2 = |X|^2 + |Y|^2 + 2X \cdot Y \]
(compare the answer to Exercise 18). Condition (47) follows since each coordinate vector $E_k$ is mapped on to the column vector $A_k$ of $a$.


21. Let the particles be $X_1, X_2, \ldots, X_k$ and their masses $m_1, m_2, \ldots, m_k$, respectively. Assume the affine transformation is given in the form $X' = aX + A$. Let the centers of mass before and after transformation be
\[ X_0 = \Big(\sum_{j=1}^{k} m_jX_j\Big)\Big/\sum_{j=1}^{k} m_j, \qquad Y_0 = \Big(\sum_{j=1}^{k} m_jX_j'\Big)\Big/\sum_{j=1}^{k} m_j, \]
respectively. Observe that $X_0' = aX_0 + A = Y_0$.


Exercises 2.3 (p. 177) 


1. (a) 0.
(b) 2.
(c) 12.
(d) $(x - y)(y - z)(z - x)(x + y + z)$.

2. $a + c = 2b$.

3. (a) Use $\det(ea) = \det(a)$.
(b) Use $\det(e) = \det(aa^{-1})$.

4. (a) $-1$.
(b) 1.
(c) $-1$.
(d) 1.


5. If all the elements of the determinant vanish, the result is immediate. Otherwise, we may suppose $a_{11} \neq 0$, for if $a_{ij} \neq 0$, we may interchange the first and $i$th rows and the first and $j$th columns to place $a_{ij}$ in the first row and column, with perhaps a change of sign in the determinant. Multiply the first column by $a_{1j}/a_{11}$ and subtract from the $j$th column to make the first element in the $j$th column vanish. Proceed similarly to make the first element in any row vanish. By means of this operation and a multiplication of the first row by $-1$ if necessary, the determinant is put in the form
\[ \begin{vmatrix} \alpha & 0 & 0 \\ 0 & b_{11} & b_{12} \\ 0 & b_{21} & b_{22} \end{vmatrix}. \]
The same procedures applied to the subdeterminant
\[ \begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix} \]
put it in the form
\[ \begin{vmatrix} \beta & 0 \\ 0 & \gamma \end{vmatrix}. \]
Since the operations on the subdeterminant can be extended to the rows and columns of the original determinant without affecting the zero elements in the first row and column, the desired form has been attained.


6. In (66a) the only possible nonzero term is that for which $j_1 = 1, j_2 = 2, \ldots, j_n = n$.

7. In $a_{j_1 1}a_{j_2 2} \cdots a_{j_n n}$, let $k$ be the least index for which $j_k \neq k$. If $j_k < k$, the product vanishes. If $j_k > k$, then $k$ must appear as a row index for a factor $a_{km}$, where $k < m$; hence, again the product vanishes. Thus, $a_{11}a_{22} \cdots a_{nn}$ is the only possible nonzero term in (66a).


8. (a) $(x - y)(y - z)(z - x)$.

(b) $-12$.
(c) 223r! 


9. $x = 3$, $y = 2$, $z = 1$.

10. Apply $\det(a) \cdot \det(b) = \det(a^Tb)$.

11. Use $D = (A + 2B)(A - B)^2 = [(x + y + z)(x^2 + y^2 + z^2 - xy - yz - xz)]^2$.
12. Since the determinant is an alternating form in the column vectors, it is immediate that $\Delta = A + Bx$. For $x = -a$, the matrix is lower-triangular and for $x = -b$, upper-triangular. Hence, from Exercise 7, $A - Ba = f(a)$ and $A - Bb = f(b)$.
13. From (57a), with $c = (c_{jk})$,
\[ A \cdot (cB) = \sum_{k=1}^{n} b_k \sum_{j=1}^{n} c_{jk}a_j = B \cdot (c^TA). \]
14. Set X = (x, y, z), A = (g, h, i), and 
1 1 
a zd 5! 
|1 1 
a=) 9% o gf 
1 1 
g of 


and rewrite the equation of the quadric in the form
\[ X \cdot (aX) + A \cdot X + j = 0. \]
If the affine transformation is given in the form
\[ X' = bX + B, \]
its inverse is
\[ X = cX' + C, \]
where $c = b^{-1}$ and $C = -b^{-1}B$. Thus the equation of the quadric in the new coordinate system is
\[ cX' \cdot (acX') + C \cdot (acX') + cX' \cdot (aC) + A \cdot cX' + C \cdot (aC) + A \cdot C + j = 0. \]
Apply the result of the preceding exercise to put this in the form
\[ X' \cdot (a'X') + A' \cdot X' + j' = 0, \]
where
\[ a' = c^Tac, \qquad A' = c^T(a^TC + aC + A), \qquad j' = C \cdot aC + A \cdot C + j. \]
15. Compare with the homogeneous linear system
\[ a_1x + a_2y + dz = 0 \]
\[ b_1x + b_2y + ez = 0 \]
\[ c_1x + c_2y + fz = 0. \]
If this system has a solution with $z = -1$, and hence a nontrivial solution, the determinant $D$ must vanish. Conversely, if the determinant vanishes, the column vectors are dependent. Thus, there exist constants $x, y, z$, not all zero, such that
\[ xA_1 + yA_2 + zB = 0, \]
where $A_i = (a_i, b_i, c_i)$ and $B = (d, e, f)$. It is not possible that $z = 0$, for then $A_1$ and $A_2$ would be dependent and all three of the given $2 \times 2$ determinants would vanish. We may therefore divide by $-z$ to make $-1$ the coefficient of $B$; hence, the desired solution exists.


16. In vector form the lines may be written as
\[ X = At + B, \qquad X = Ct + D. \]
The lines are parallel if and only if $A$ and $C$ are parallel (this includes the case that the lines are the same). They intersect if and only if there exist numbers $t_1$ and $t_2$ for which $At_1 + B = Ct_2 + D$. Thus, by the solution of the preceding exercise, the condition is that the matrix with column vectors $A$, $C$, $B - D$ have a vanishing determinant; that is,
\[ \begin{vmatrix} a_1 & c_1 & b_1 - d_1 \\ a_2 & c_2 & b_2 - d_2 \\ a_3 & c_3 & b_3 - d_3 \end{vmatrix} = 0. \]


17. A set of interchanges that permutes $j_1, j_2, \ldots, j_n$ into $1, 2, \ldots, n$ also permutes $1, 2, \ldots, n$ into $k_1, k_2, \ldots, k_n$. Consequently, $j_1, j_2, \ldots, j_n$ and $k_1, k_2, \ldots, k_n$ are either both even or both odd permutations of $1, 2, \ldots, n$.
18. In vector form this states that the vector equation
\[ aX = \lambda X \]
must have at least one nontrivial solution. Rewrite the equation in the form of a homogeneous equation:
\[ (a - \lambda e)X = O, \]
where $e$ is the unit matrix. This equation has a nontrivial solution if and only if
\[ \det(a - \lambda e) = 0. \]
In $n$-dimensional space this is a polynomial equation in $\lambda$ of $n$th degree with leading term $(-1)^n\lambda^n$. Thus, a solution always exists if $n$ is odd.


Exercises 2.4 (p. 202) 


1. Let $X_0$ be the position vector of $P$ and express the line in the vector form $X = At + B$. The distance $r$ from $P$ to $l$ is $|X_0 - B| \sin\theta$, where $\theta$ is the angle between $X_0 - B$ and $A$; hence,
\[ r = |(X_0 - B) \times A| / |A|. \]


2. The velocity is $r\omega$, where $r$ is the distance of the point from the axis of rotation. From the solution of the preceding exercise, with $B$ representing the origin, $X_0 = (x, y, z)$, and $A = (\alpha, \beta, \gamma)$,
\[ r\omega = \omega\,[(y\gamma - z\beta)^2 + (z\alpha - x\gamma)^2 + (x\beta - y\alpha)^2]^{1/2}. \]


3. Name the position vectors of the three points $X_1, X_2, X_3$, respectively. If $X = (x, y, z)$ represents any point of the plane, the three vectors $X_1 - X$, $X_2 - X$, $X_3 - X$ lie in a two-dimensional space and, hence, are dependent. Consequently,
\[ \det(X_1 - X,\; X_2 - X,\; X_3 - X) = 0. \]


4. Let the equations of the lines be given in vector form by $l$: $X = At + B$ and $l'$: $X' = A't' + B'$. The shortest segment $PP'$ with one end point on each line must be perpendicular to both. For, say, $PP'$ is not perpendicular to $l'$ at $P'$; then the perpendicular from $P$ to $l'$ would be shorter. If $X$ and $X'$ are the position vectors of $P$ and $P'$, respectively,
\[ X - X' = At + B - A't' - B' = k(A \times A'). \]
To determine $k$, take the dot product with $(A \times A')$ in this equation, which yields
\[ k = \frac{(B - B') \cdot (A \times A')}{|A \times A'|^2}, \]
which yields the desired distance $d$ through
\[ d^2 = |X - X'|^2 = k^2 |A \times A'|^2 \]
or
\[ d = \frac{|(B - B') \cdot (A \times A')|}{|A \times A'|}. \]
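The closing formula for the distance between two non-parallel lines lends itself to a direct computation. A minimal sketch follows, assuming each line is given by a direction vector and a point as 3-tuples; `cross`, `dot`, and `line_distance` are illustrative names.

```python
import math

def cross(u, v):
    """Vector product in three dimensions."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def line_distance(A, B, A2, B2):
    """Distance between the lines X = A t + B and X' = A2 t' + B2."""
    n = cross(A, A2)
    n_len = math.sqrt(dot(n, n))
    if n_len < 1e-12:
        raise ValueError("lines are parallel; the formula does not apply")
    diff = [b - b2 for b, b2 in zip(B, B2)]
    return abs(dot(diff, n)) / n_len

# The x-axis and a line parallel to the y-axis at height z = 5.
d = line_distance((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 0, 5))
```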
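5. The sum does not depend on the choice of origin, since a different choice of origin $(a, b)$ amounts to replacing each determinant
\[ \Delta_k = \begin{vmatrix} x_k & x_{k+1} \\ y_k & y_{k+1} \end{vmatrix} \quad\text{by}\quad \Delta_k' = \begin{vmatrix} x_k - a & x_{k+1} - a \\ y_k - b & y_{k+1} - b \end{vmatrix}. \]
Because
\[ \Delta_k' = \Delta_k - \begin{vmatrix} x_k & a \\ y_k & b \end{vmatrix} + \begin{vmatrix} x_{k+1} & a \\ y_{k+1} & b \end{vmatrix}, \]
each additional determinant $\begin{vmatrix} x_k & a \\ y_k & b \end{vmatrix}$ appears twice in the total, but with opposite signs. Thus, we may choose the origin in the interior of the polygon. The area of the polygon is the sum of the areas of the triangles $OP_kP_{k+1}$, $k = 1, \ldots, n$ (where $P_{n+1} = P_1$), but the area of $OP_kP_{k+1}$ is precisely
\[ \frac{1}{2}\begin{vmatrix} x_k & x_{k+1} \\ y_k & y_{k+1} \end{vmatrix}. \]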
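The sum of half-determinants is easy to evaluate in code. The sketch below assumes the vertices are listed counterclockwise as `(x, y)` pairs; the function name `polygon_area` is hypothetical. It also illustrates the independence of the choice of origin by translating the polygon.

```python
def polygon_area(points):
    """Half the sum of the determinants x_k*y_{k+1} - x_{k+1}*y_k over all edges."""
    n = len(points)
    total = 0.0
    for k in range(n):
        xk, yk = points[k]
        xn, yn = points[(k + 1) % n]  # P_{n+1} = P_1
        total += xk * yn - xn * yk
    return total / 2.0

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
area = polygon_area(square)
# Moving the origin, i.e. translating every vertex, leaves the sum unchanged.
shifted_area = polygon_area([(x + 7, y - 3) for x, y in square])
```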

6. Subtract the third row from the first two to show that the determinant equals $\tfrac{1}{2}\,X_1 \times X_2$, where $X_1 = (x_1 - x_3, y_1 - y_3)$ and $X_2 = (x_2 - x_3, y_2 - y_3)$.

7. If the coordinates of the vertices are rational, the area of the triangle as defined by the determinant is clearly rational. But, for an equilateral triangle with side length $s$, the area is $\tfrac{1}{4}s^2\sqrt{3}$, where
\[ s^2 = (x_i - x_j)^2 + (y_i - y_j)^2 \quad (i \neq j) \]
is plainly rational. Since $\sqrt{3}$ is irrational, the area would then be irrational, a contradiction.
8. (a) In vector form, this states
\[ |A \cdot (A' \times A'')| \le |A| \cdot |A'| \cdot |A''|, \]
which is obviously true, since
\[ |A' \times A''| \le |A'| \cdot |A''| \]
and
\[ |D| = |A \cdot (A' \times A'')| \le |A| \cdot |A' \times A''|. \]

(b) Equality can hold only if it holds in both the preceding inequalities. Thus $A$, $A'$, and $A''$ must be mutually perpendicular.


9. (a) If $B$ and $C$ are dependent, say, $C = \lambda B$, the identity is trivially true. Otherwise, form the orthonormal basis $E_1, E_2, E_3$, where the respective vectors are unit vectors in the directions of $B$, $B \times C$, $B \times (B \times C)$. Write $A$, $B$, and $C$ in terms of this basis:
\[ A = a_1E_1 + a_2E_2 + a_3E_3, \qquad B = bE_1, \qquad C = c_1E_1 + c_3E_3, \]
to obtain $B \times C = -bc_3E_2$ and
\[ A \times (B \times C) = bc_3(a_3E_1 - a_1E_3). \]
Employ $E_1 = (1/b)B$ and $E_3 = (1/c_3)[C - (c_1/b)B]$ to obtain
\[ A \times (B \times C) = (a_1c_1 + a_3c_3)B - (a_1b)C. \]
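Since $a_1c_1 + a_3c_3 = A \cdot C$ and $a_1b = A \cdot B$ in this basis, the result is the identity $A \times (B \times C) = (A \cdot C)B - (A \cdot B)C$. The following sketch checks it with exact integer arithmetic on arbitrarily chosen sample vectors; the helper names are illustrative.

```python
def cross(u, v):
    """Vector product in three dimensions."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A, B, C = [1, 2, 3], [4, -1, 2], [0, 5, -2]
lhs = cross(A, cross(B, C))
# (A . C) B - (A . B) C, computed component by component
rhs = [dot(A, C) * b - dot(A, B) * c for b, c in zip(B, C)]
```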
(b) Observe that
\[ Z = (X \times Y) \cdot (X' \times Y') = \det(X, Y, X' \times Y') = \det(Y, X' \times Y', X) = [Y \times (X' \times Y')] \cdot X. \]
Apply Exercise 9a to obtain
\[ Z = [(Y \cdot Y')X' - (Y \cdot X')Y'] \cdot X. \]

(c) Apply Exercise 9a to rewrite the expression on the left as
\[ U = [(X \cdot Z)Y - (X \cdot Y)Z] \cdot V, \]
where
\[ V = [(Y \cdot X)Z - (Y \cdot Z)X] \times [(Z \cdot Y)X - (Z \cdot X)Y] \]
\[ = (Y \cdot X)(Y \cdot Z)(Z \times X) + (X \cdot Y)(X \cdot Z)(Y \times Z) + (Z \cdot Y)(Z \cdot X)(X \times Y). \]
Thus,
\[ U = (X \cdot Z)(Y \cdot X)(Y \cdot Z)[Y \cdot (Z \times X)] - (X \cdot Y)(Z \cdot Y)(Z \cdot X)[Z \cdot (X \times Y)] = 0. \]


10. Let $E$ be the unit vector in the direction of $(-1, 0, 1)$; thus, $E = (-\tfrac{1}{2}\sqrt{2}, 0, \tfrac{1}{2}\sqrt{2})$. Let $X = (x, y, z)$ be the position vector of any point and $A$ the foot of the perpendicular from the point to the axis of rotation:
\[ A = (X \cdot E)E = \left(\tfrac{1}{2}(x - z),\; 0,\; \tfrac{1}{2}(z - x)\right). \]
Note that $X - A$ is perpendicular to $A$ and introduce the mutual perpendicular $E \times (X - A)$ to these two. If $X'$ is the position vector of the image of $(x, y, z)$ in the rotation, then $X' - A$ is perpendicular to $A$ and the given orientation condition yields
\[ (X - A) \times (X' - A) = r^2 \sin\phi\, E, \]
where $r = |X - A| = |X' - A|$ is the distance of $X$ from the axis. Set
\[ X' = \lambda A + \mu(X - A) + \nu[E \times (X - A)], \]
as we may, since the vectors appearing in the linear combination are mutually perpendicular. From $(X' - A) \cdot A = 0$, it follows that $\lambda = 1$; from $(X' - A) \cdot (X - A) = r^2 \cos\phi$, we have $\mu = \cos\phi$. Finally, from Exercise 9a,
\[ r^2 \sin\phi\, E = (X - A) \times (X' - A) = \nu\,(X - A) \times [E \times (X - A)] = \nu r^2 E; \]
thus, $\nu = \sin\phi$. Employ
\[ X - A = \left(\tfrac{1}{2}(x + z),\; y,\; \tfrac{1}{2}(x + z)\right), \]
\[ E \times (X - A) = E \times X = \tfrac{1}{2}\sqrt{2}\,(-y,\; x + z,\; -y) \]
to obtain $X' = aX$, where
\[ a = \begin{pmatrix} \tfrac{1}{2}(\cos\phi + 1) & -\tfrac{1}{2}\sqrt{2}\sin\phi & \tfrac{1}{2}(\cos\phi - 1) \\ \tfrac{1}{2}\sqrt{2}\sin\phi & \cos\phi & \tfrac{1}{2}\sqrt{2}\sin\phi \\ \tfrac{1}{2}(\cos\phi - 1) & -\tfrac{1}{2}\sqrt{2}\sin\phi & \tfrac{1}{2}(\cos\phi + 1) \end{pmatrix}. \]
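Two properties the rotation matrix must have can be checked numerically: it is orthogonal, and it leaves the axis direction $E = (-1, 0, 1)/\sqrt{2}$ fixed. The sketch below assumes the entries $\tfrac{1}{2}(\cos\phi \pm 1)$, $\pm\tfrac{1}{2}\sqrt{2}\sin\phi$, and $\cos\phi$ arranged as in the code; the function name `rotation_about_axis` is illustrative.

```python
import math

def rotation_about_axis(phi):
    """Rotation through phi about the axis with direction (-1, 0, 1)."""
    s = math.sqrt(2) / 2 * math.sin(phi)
    c = math.cos(phi)
    return [[(c + 1) / 2, -s, (c - 1) / 2],
            [s, c, s],
            [(c - 1) / 2, -s, (c + 1) / 2]]

def apply(m, x):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(m[i][j] * x[j] for j in range(3)) for i in range(3)]

phi = 0.8
a = rotation_about_axis(phi)
E = [-1 / math.sqrt(2), 0.0, 1 / math.sqrt(2)]
image_of_axis = apply(a, E)
# Entries of a * a^T, which should form the unit matrix.
gram = [[sum(a[i][k] * a[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)]
```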


11. From Exercise 9a,
\[ X = [(A \times B) \cdot D]C - [(A \times B) \cdot C]D = [(C \times D) \cdot A]B - [(C \times D) \cdot B]A. \]
Since $A$, $B$, $C$ are independent, $(A \times B) \cdot C \neq 0$ and we may solve for $D$.

12. Let $E_1', E_2', E_3'$ be the unit coordinate vectors in the new coordinate system. We are given $E_3 \cdot E_3' = \cos\theta$, $E_1 \times (E_3 \times E_3') = \sin\theta\sin\phi\,E_3$, and $E_1' \times (E_3 \times E_3') = -\sin\theta\sin\psi\,E_3'$. Furthermore, $E_1 \cdot (E_3 \times E_3') = \sin\theta\cos\phi$ and $E_1' \cdot (E_3 \times E_3') = \sin\theta\cos\psi$. Thus, from Exercise 9a, $E_1 \cdot E_3' = \sin\theta\sin\phi$ and $E_1' \cdot E_3 = \sin\theta\sin\psi$. Now, set
\[ E_i = \sum_{j=1}^{3} a_{ij}E_j', \]
where
\[ (a_{ij}) = (E_i \cdot E_j') \]
is the matrix we seek. The information we already have yields
\[ a_{13} = \sin\theta\sin\phi, \qquad a_{31} = \sin\theta\sin\psi, \qquad a_{33} = \cos\theta. \]
Form $E_3 \times E_3' = -\sin\theta\sin\psi\,E_2' + a_{32}E_1'$ and take the scalar product with $E_1'$ to find
\[ E_1' \cdot (E_3 \times E_3') = \sin\theta\cos\psi = a_{32}. \]
Thus,
\[ E_3 = \sin\theta\sin\psi\,E_1' + \sin\theta\cos\psi\,E_2' + \cos\theta\,E_3'. \]
Using this expression for $E_3$, solve for $a_{11}$ and $a_{12}$ in the equations
\[ E_1 \cdot E_3 = 0, \qquad |E_1|^2 = 1, \]
to obtain
\[ a_{11} = -\cos\theta\sin\phi\sin\psi \pm \cos\phi\cos\psi, \]
\[ a_{12} = -\cos\theta\sin\phi\cos\psi \pm \cos\phi\sin\psi. \]
The undetermined signs in these expressions for $a_{11}$ and $a_{12}$ are fixed by the condition $E_1 \cdot (E_3 \times E_3') = \sin\theta\cos\phi$, which yields the plus sign in the expression for $a_{11}$ and the minus sign for $a_{12}$. Set $E_2 = E_3 \times E_1$ to obtain, finally,
\[ (a_{ij}) = \begin{pmatrix} -\cos\theta\sin\phi\sin\psi + \cos\phi\cos\psi & -\cos\theta\sin\phi\cos\psi - \cos\phi\sin\psi & \sin\theta\sin\phi \\ \cos\theta\cos\phi\sin\psi + \sin\phi\cos\psi & \cos\theta\cos\phi\cos\psi - \sin\phi\sin\psi & -\sin\theta\cos\phi \\ \sin\theta\sin\psi & \sin\theta\cos\psi & \cos\theta \end{pmatrix}. \]
Note that this result holds also for $\theta = 0$ or $\pi$, when $\phi$ and $\psi$ become indeterminate with $\phi + \psi = \angle xOx'$ or $\phi - \psi = \angle xOx'$, respectively. The angles $\phi, \psi, \theta$ are so-called Eulerian angles, and our result shows that the most general orthogonal matrix with determinant of value $+1$ may be expressed "parametrically" by means of the three variables $\phi, \psi, \theta$, subject to the inequalities
\[ 0 \le \theta \le \pi, \qquad 0 \le \phi < 2\pi, \qquad 0 \le \psi < 2\pi. \]
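The Euler-angle matrix can be sanity-checked numerically. The sketch below is an assumption-laden reconstruction: it takes the third row to be $(\sin\theta\sin\psi, \sin\theta\cos\psi, \cos\theta)$, the first row with the plus sign in $a_{11}$ and the minus sign in $a_{12}$, and the second row from $E_2 = E_3 \times E_1$. The function names are hypothetical. It confirms that such a matrix is orthogonal with determinant $+1$.

```python
import math

def euler_matrix(phi, psi, theta):
    """Matrix (a_ij) = (E_i . E_j') written in the Eulerian angles."""
    sp, cp = math.sin(phi), math.cos(phi)
    ss, cs = math.sin(psi), math.cos(psi)
    st, ct = math.sin(theta), math.cos(theta)
    return [[-ct * sp * ss + cp * cs, -ct * sp * cs - cp * ss, st * sp],
            [ct * cp * ss + sp * cs, ct * cp * cs - sp * ss, -st * cp],
            [st * ss, st * cs, ct]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

a = euler_matrix(0.4, 1.3, 0.9)
gram = [[sum(a[i][k] * a[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)]
determinant = det3(a)
```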


13. Let $A = a_1E_1 + a_2E_2 + \cdots + a_mE_m$ be a nonzero vector of $\pi$ perpendicular to all the vectors of $\pi'$ with, say, $a_1 \neq 0$. Using $E_1 = (1/a_1)(A - a_2E_2 - \cdots - a_mE_m)$, we obtain from (85a)
\[ \mu = \frac{1}{a_1}[A - a_2E_2 - \cdots - a_mE_m, E_2, \ldots, E_m;\; E_1', \ldots, E_m'] = \frac{1}{a_1}[A, E_2, \ldots, E_m;\; E_1', E_2', \ldots, E_m'] = 0. \]
Conversely, if $\mu = 0$, the column vectors in the determinant representation (85a) of $\mu$ are dependent: for some nontrivial set of coefficients,
\[ \lambda_1 E_k \cdot E_1' + \lambda_2 E_k \cdot E_2' + \cdots + \lambda_m E_k \cdot E_m' = 0 \quad (k = 1, 2, \ldots, m). \]
Then
\[ E_k \cdot (\lambda_1E_1' + \lambda_2E_2' + \cdots + \lambda_mE_m') = 0 \]
and we have a vector of $\pi'$ orthogonal to every basis vector and, hence, every vector of $\pi$.


Exercises 2.5 (p. 215) 


1. Let the coordinates of $P$ be $(x_1', x_2', x_3')$; of $Q$, $(x_1'', x_2'', x_3'')$. Thus $\overrightarrow{PQ}$ represents the vector $U$, where $u_i = x_i'' - x_i'$. The coordinates of $P$ and $Q$ in the new system are given by (89a) with appropriate primes and $\overrightarrow{PQ}$ represents the vector $v_i = y_i'' - y_i'$ whose components clearly satisfy (89a).


5. Let the curve be expressed vectorially by $X(t)$, and let the three values of the parameter be given by $t, t_1, t_2$, and the corresponding points by $X = X(t)$, $X_1 = X(t_1)$, $X_2 = X(t_2)$. The normal to the plane through the three points is parallel to
\[ (X_1 - X) \times (X_2 - X). \]
Setting $t_1 - t = h$, $t_2 - t = k$ and using Taylor's theorem, obtain
\[ X_1 - X = h\,\frac{dX}{dt} + \frac{1}{2}h^2\,\frac{d^2X}{dt^2} + \cdots, \]
and similarly for $X_2 - X$ with $k$ in place of $h$. Thus, to lowest order,
\[ (X_1 - X) \times (X_2 - X) = \frac{1}{2}\left(\frac{dX}{dt} \times \frac{d^2X}{dt^2}\right)(hk^2 - kh^2). \]
In the limit as $h$ and $k$ approach 0 and as $t$ approaches $t_0$, the normal to the osculating plane takes the direction of $dX/dt \times d^2X/dt^2$ at $X_0 = X(t_0)$. Thus, the position vector $Y$ of a point of the osculating plane satisfies
\[ (Y - X_0) \cdot \left(\frac{dX}{dt} \times \frac{d^2X}{dt^2}\right)\bigg|_{t = t_0} = 0. \]


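6. From the result of the preceding exercise, we must show that $dX/ds$ and $d^2X/ds^2$ are both perpendicular to $dX/dt \times d^2X/dt^2$. This is immediate from
\[ \frac{dX}{ds} = \frac{dX}{dt}\frac{dt}{ds}, \qquad \frac{d^2X}{ds^2} = \frac{dX}{dt}\frac{d^2t}{ds^2} + \frac{d^2X}{dt^2}\left(\frac{dt}{ds}\right)^2. \]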


7. Let the curve be given by $X(s)$, where $s$ is arc length, and expand $X$ by Taylor's theorem:
\[ X(s) = X(s_0) + X'(s_0)\,l + Y\,O(l^2), \]
where $l = s - s_0$ and $Y$ is bounded. Thus, since $|X'(s_0)| = 1$,
\[ d - l = |X(s) - X(s_0)| - l = |X'(s_0)\,l + Y\,O(l^2)| - l \le |X'(s_0)|\,l + O(l^2) - l; \]
that is, $d - l = O(l^2) = o(l)$.


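8. From the solution to Exercise 6, with primes denoting derivatives with respect to $t$,
\[ k = \left|\frac{d^2X}{ds^2}\right| = \left|X'\,\frac{d^2t}{ds^2} + X''\left(\frac{dt}{ds}\right)^2\right|. \]
Note that
\[ \frac{dt}{ds} = \frac{1}{|X'|}; \]
hence,
\[ \frac{d^2t}{ds^2} = -\frac{X' \cdot X''}{|X'|^4}. \]
Thus,
\[ k = \frac{\sqrt{|X'|^2\,|X''|^2 - (X' \cdot X'')^2}}{|X'|^3}. \]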
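The curvature expression in terms of the first and second derivatives with respect to an arbitrary parameter can be tested on a circular helix, whose curvature $R/(R^2 + h^2)$ is a standard result. This is a sketch; `curvature` and the sample values are illustrative choices, not taken from the text.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def curvature(d1, d2):
    """k = sqrt(|X'|^2 |X''|^2 - (X' . X'')^2) / |X'|^3 for derivatives d1, d2."""
    a = dot(d1, d1)
    b = dot(d2, d2)
    c = dot(d1, d2)
    return math.sqrt(a * b - c * c) / a ** 1.5

# Helix X(t) = (R cos t, R sin t, h t), evaluated at an arbitrary parameter value.
R, h, t = 2.0, 1.0, 0.7
d1 = (-R * math.sin(t), R * math.cos(t), h)
d2 = (-R * math.cos(t), -R * math.sin(t), 0.0)
k = curvature(d1, d2)
```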
9. From the solution to Exercise 6, $d^2X/dt^2$ is a linear combination of $dX/ds$ and $d^2X/ds^2$.

10. Let $C$ be represented by $X(t)$ and assume that the position vector $X(t_0)$ of $B$ is not an end point of $C$. Let $Y$ be the position vector of $A$. If $|Y - X(t_0)|$ is a minimum, then
\[ \frac{d}{dt}\,|Y - X(t)|^2\,\bigg|_{t = t_0} = 0; \]
that is,
\[ [Y - X(t_0)] \cdot X'(t_0) = 0. \]
11. Let the curve be given parametrically by $X(\theta)$, where $x = a\cos\theta$, $y = a\sin\theta$. The tangent plane depends only on $x$ and $y$, not $z$, and it makes the angle $\theta$ with the y-axis. The z-component of the tangent vector $X'$ to the curve satisfies
\[ \frac{z'}{\sqrt{x'^2 + y'^2 + z'^2}} = \cos\theta, \]
or
\[ \frac{z'}{\sqrt{a^2 + z'^2}} = \cos\theta. \]
Thus,
\[ z' = \pm a\cot\theta; \]
whence,
\[ z = c \pm a\log\sin\theta. \]
For the curvature, see Exercise 8.
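12. From $dX/d\theta = (-\sin\theta, \cos\theta, \sinh A\theta)$, we have
\[ \frac{d^2X}{d\theta^2} = (-\cos\theta, -\sin\theta, A\cosh A\theta); \]
the equation for any point $Y$ of the osculating plane becomes
\[ (Y - X) \cdot \left(\frac{dX}{d\theta} \times \frac{d^2X}{d\theta^2}\right) = 0, \]
where the normal vector is given by
\[ \frac{dX}{d\theta} \times \frac{d^2X}{d\theta^2} = (N_1, N_2, N_3) \]
and
\[ N_1 = A\cos\theta\cosh A\theta + \sin\theta\sinh A\theta, \]
\[ N_2 = A\sin\theta\cosh A\theta - \cos\theta\sinh A\theta, \]
\[ N_3 = 1. \]
The distance of the plane from the origin is $|X \cdot N|/|N|$, and, since $X \cdot N = (A + 1/A)\cosh A\theta$ and $|N|^2 = (A^2 + 1)\cosh^2 A\theta$, the result follows.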

13. (a) Let $X(t)$ be the parametric representation of the curve and set $X_i = X(t_i)$. The plane through the three points, by Exercise 3 of Section 2.4, is
\[ (X_1 - X) \cdot [(X_2 - X) \times (X_3 - X)] = 0 \]
or
\[ X \cdot [X_1 \times X_2 + X_2 \times X_3 + X_3 \times X_1] = X_1 \cdot (X_2 \times X_3), \]
from which the result follows.
(b) The three osculating planes have the equations
\[ (X - X_i) \cdot (X_i' \times X_i'') = 0 \]
(from Exercise 6) or, in terms of coordinates,


Bx Gti, BH? ag 
a pI Tt e 7 tii? = 0. 


Thus, if (x, y, z) is a point common to the three osculating planes, 
tı, t2, t3 are the three roots of the above equation with coefficients: 


ttt t= 2, 


tite + tets + tst1 = $y , 


titet3 = 3x . 
a 


14. Since a sphere is determined by any four of its noncoplanar points, we may impose four conditions on the sphere of closest contact: that the contact of curve and sphere be of third order. Let $X(s)$ be the representation of the curve in terms of arc length and $A$ the center of the sphere. Require that the derivatives of $|X - A|^2$ vanish up to third order; thus, from $|\dot X|^2 = 1$ and $\dot X \cdot \ddot X = 0$,
\[ (X - A) \cdot \dot X = 0, \]
\[ (X - A) \cdot \ddot X + 1 = 0, \]
\[ (X - A) \cdot \dddot X = 0. \]
From the first and last of these equations, $X - A = \lambda(\dot X \times \dddot X)$, where $\lambda$ is given by the second equation. Hence,
\[ A = X + \frac{\dot X \times \dddot X}{\ddot X \cdot (\dot X \times \dddot X)}. \]
15. Set |X — A|= 1 in the solution of the preceding exercise. 
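16. Since, by Exercise 6, $\xi_3$ is normal to the osculating plane, $1/\tau = |\dot\xi_3|$. Furthermore, since $\dot\xi_2$ and $\xi_2$ are perpendicular, and likewise $\dot\xi_3$ and $\xi_3$,
\[ \dot\xi_2 = a\xi_1 + b\xi_3 \quad\text{and}\quad \dot\xi_3 = c\xi_1 + d\xi_2. \]
Differentiate $\xi_1 = \xi_2 \times \xi_3$ to obtain
\[ \dot\xi_1 = \frac{1}{\rho}\,\xi_2 = (\dot\xi_2 \times \xi_3) + (\xi_2 \times \dot\xi_3) = -a\xi_2 - c\xi_3; \]
hence $a = -1/\rho$ and $c = 0$. From $\dot\xi_3 = d\xi_2$, $d = \pm 1/\tau$; choose the minus sign. To determine $b$, differentiate $\xi_3 = \xi_1 \times \xi_2$:
\[ \dot\xi_3 = -\frac{1}{\tau}\,\xi_2 = (\dot\xi_1 \times \xi_2) - (\dot\xi_2 \times \xi_1) = -b\,\xi_2; \]
whence $b = 1/\tau$.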
17. (a) Differentiate $\ddot X = \dot\xi_1 = k\xi_2$ to obtain
\[ \dddot X = \dot k\,\xi_2 + k\,\dot\xi_2 = -k^2\xi_1 + \dot k\,\xi_2 + \frac{k}{\tau}\,\xi_3. \]

(b) From the result of Exercise 14,


Ee È 
Z + gaz $3- 


18. Since $1/\tau = |\dot\xi_3| = 0$, then $\dot\xi_3 = 0$ and, therefore, $\xi_3$ must be a constant vector. From $0 = \xi_1 \cdot \xi_3 = \dot X \cdot \xi_3 = \frac{d}{ds}(X \cdot \xi_3)$, it follows that $X \cdot \xi_3 = $ constant.


19. Let $A$ and $P$ be the position vectors of $A$ and $P$, respectively. Set $X = A - P$; hence $\dot X = -\dot P$. The equation states
\[ \frac{d}{dt}|X| = -a \cdot \dot P, \]
which follows directly from the differentiation formula
\[ \frac{d}{dt}|X| = \frac{d}{dt}\sqrt{X \cdot X} = \frac{X \cdot \dot X}{|X|} \]
with $a = X/|X|$.
20. (a) Set $X = A - P$ as in the preceding solution. From that solution,
\[ -\dot P = \dot X = \frac{d}{dt}(|X|\,a) = -(a \cdot \dot P)\,a + |X|\,\dot a, \]
and the desired result is immediate.
(b) Introduce the expression for a and the similar expressions for bin 


Ë = va + vb + wè + ġa + ób we. 


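21. (a) Let the curve be given by $X(t)$. The surface then has the parametric equation
\[ Y = X(t) + \lambda\dot X(t). \]
The vector $\partial Y/\partial\lambda \times \partial Y/\partial t$ is normal to the surface, but
\[ \frac{\partial Y}{\partial\lambda} \times \frac{\partial Y}{\partial t} = \dot X(t) \times [\dot X(t) + \lambda\ddot X(t)] = \lambda\,\dot X(t) \times \ddot X(t) \]
is also normal to the osculating plane.

(b) Set $Y = (x, y, z)$ and $X(t) = (\alpha(t), \beta(t), \gamma(t))$. Thus, $x$ and $y$ are functions of $t$ and $\lambda$ satisfying
\[ x = \alpha(t) + \lambda\dot\alpha(t), \]
\[ y = \beta(t) + \lambda\dot\beta(t). \]
Use
\[ u(x, y) = \gamma(t) + \lambda\dot\gamma(t) \]
to calculate $u_{xx}$, $u_{yy}$, and $u_{xy}$ in terms of derivatives with respect to $t$ and $\lambda$.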
Differentiate Y = X(t) + AX(#) with respect to x to obtain, (A = s) 


Yz = (1,0,u1) = (X + AX)tze + Xsz. 


Form X x Y, and equate components in the x and z directions to 
obtain 


bus = st2(B, Y), Ê = —stz(x, B), 
where (u, v) is defined by 
(u, v) = uv — Vii. 
Thus, 


BY) p b 
(x, B)’ s(a, B) ` 


Similarly, from X x Y, obtain 


Ur = 


D, __ 4 
(a, B)? ~~ s(a, B) 


Note that uz and uy do not depend on à. Consequently, 


Uy = 


Uss = teus = a B) i a D 
Uyy = ty uy -7 a B) J: A 5 
and 
Ury = ty Zuz = cl B) S a 5 
= te Sty =— if B) a e OF 


from which the result is immediate. 


Exercises 3.1a (p. 219)


1. Set $y_{n+1} = y_n + cf(a, y_n)$, where $c$ is constant, and apply the methods of Volume I, Sections 6.3c and d, with $\phi(y) = y + cf(a, y)$. To guarantee convergence, we require $|\phi'(y)| \le q < 1$ on some interval containing $b$, and the smaller the $q$, the better. Consequently, we attempt to fix $c$ so that $\phi'(y)$ is nearly zero, or
\[ c \approx -\frac{1}{f_y(a, b)}. \]
Thus we begin with the assumption $f_y(a, b) \neq 0$. In practice, we choose $c = -1/f_y(a, y_0)$, where $y_0$ is close to the sought-for solution $b$. The condition for convergence then becomes
\[ |\phi'(y)| = \left|\frac{f_y(a, y_0) - f_y(a, y)}{f_y(a, y_0)}\right| \le q < 1 \]
for all $y$ in some neighborhood of $b$. Suppose $f_y$ satisfies a Lipschitz condition
\[ |f_y(a, \eta_2) - f_y(a, \eta_1)| \le K\,|\eta_2 - \eta_1| \]
on some neighborhood of $b$. Within this neighborhood, let $\varepsilon$ be the radius of some perhaps smaller neighborhood where $\partial f/\partial y$ is bounded away from 0,
\[ |f_y(a, y)| > m > 0; \]
such a neighborhood exists by virtue of the Lipschitz condition and $f_y(a, b) \neq 0$. For an initial choice $y_0$ satisfying


—bli< Tl 
| yo |< max |: OK 
the iteration scheme converges to b through 


1 
[Yn — b| S5a"lyo — b|. 
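The iteration scheme can be illustrated on a concrete equation. The sketch below assumes the sample function $f(x, y) = x^2 + y^2 - 2$, which is not taken from the text, with $a = 1$ and solution $b = 1$; the name `solve_implicit`, the tolerance, and the iteration cap are hypothetical choices.

```python
def solve_implicit(f, fy, a, y0, tol=1e-12, max_iter=100):
    """Iterate y_{n+1} = y_n + c f(a, y_n) with the fixed choice c = -1/f_y(a, y0)."""
    c = -1.0 / fy(a, y0)
    y = y0
    for _ in range(max_iter):
        y_next = y + c * f(a, y)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    raise RuntimeError("iteration did not converge")

# Sample problem: f(x, y) = x^2 + y^2 - 2 with a = 1, so that b = 1.
root = solve_implicit(lambda x, y: x * x + y * y - 2.0,
                      lambda x, y: 2.0 * y,
                      a=1.0, y0=1.2)
```

Here $\phi'(y) = 1 - y/1.2$, which is about $1/6$ near $y = 1$, so the error shrinks by roughly that factor at each step.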


Exercises 3.1b (p. 221) 


1. (a) The tangent plane is horizontal. The surface intersects the tangent 
plane in the pair of lines y = x and y = —x; hence, y cannot be ex- 
pressed as a function of x in the neighborhood of (xo, yo). 

(b) The surface is a cylinder with generators parallel to the vector 
i — j. Thus, the line y = 1 — x, z = 0 lies on the surface and yields 
the desired solution y = 1 — x. 

(c) The surface is a cylinder with generators parallel to i — j. The 
solution is y = 1/2 — x. 

(d) The tangent plane y + z = 0 is not horizontal. Thus, the curve 
f(x, y) = 0 is tangent to the line y = 0 at the origin. 


Exercises 3.1c (p. 225) 


1. By subtracting the constant on the right from both sides, we may put 
each of these equations in the form F(x, y) = 0. The conditions of the 
theorem are satisfied. In particular, each given point is an initial so- 
lution F(xo, yo) = 0; and Fy(xo, yo) has nonzero values, namely, (a) 4, 
(b) —1, (c) 2, (d) 6. 


_2x+y, 5 


x+ 2y’? 4 
(b) Explicitly, $y = \pi/2x$; hence, $y' = -\pi/2x^2$. Implicitly,


2. (a) 
, _COot xy — xy, T
y x2 9 9 . 
(c) Explicitly, $y = 1/x$; hence, $y' = -1/x^2$. Implicitly, $y' = -y/x$; $-1$.


(d) y = 219 


xe + By?’ 
n Z6 + xy ty) _ 42, 
3) = GF Get Bye? 82" 


no 7”, 
(b) y =? T 


1 2y 2, 
(c) y = 
(a) y” = — H50 y0 — xy) + 206° + y°) + 8xy — 30), _19 


(x + 5y4)8 3 
4. From the positive sign of their second derivatives, b and c. 
5. Assume that the equation defines $y$ as a differentiable function of $x$ in a neighborhood of each extreme value. Then at an extremum $F_x(x, y) = 0$. Maximum, $y = 6$; minimum, $y = -6$.


6. Set $F(x, y) = y - y_0 - \int_{x_0}^{x} f(\xi, y)\,d\xi$ and note that
\[ F_y(x, y) = 1 - \int_{x_0}^{x} f_y(\xi, y)\,d\xi > 0 \]
for $x$ sufficiently close to $x_0$.


Exercises 3.1d (p. 228) 


1. f(x, y) = y8? + x near (0, 0). 

2. Same as for Exercise 1. 

3. Since $F_y(x, y) = (3y^2 - 2y + 1) + x^2$ is the sum of a positive quadratic expression in $y$ and a square, it follows that $F_y(x, y) > 0$ for each $x$ and all $y$. Consequently, for each $x$, $F(x, y)$ is strictly increasing in $y$. Thus, $F(x, y) = 0$ can have no more than one solution $y$ corresponding to each fixed $x$. Such a solution must exist because for each $x$, $y^3 - y^2 + (1 + x^2)y = G(x, y)$ takes on arbitrarily large values of both signs, positive and negative, for appropriate values of $y$. It follows by the intermediate value theorem that $G(x, y)$ takes on all real values. In particular, for some value of $y$, $G(x, y) = \phi(x)$; hence, for each $x$ and this value of $y$, $F(x, y) = G(x, y) - \phi(x) = 0$.


Exercises 3.1e (p. 230)


1. Set $F(x, y, z) = x + y + z - \sin xyz$. $F_z(0, 0, 0) = 1 \neq 0$.
\[ \frac{\partial z}{\partial x} = \frac{yz\cos xyz - 1}{1 - xy\cos xyz}, \qquad \frac{\partial z}{\partial y} = \frac{xz\cos xyz - 1}{1 - xy\cos xyz}. \]


2. Since each equation can be put in the form $F(z, x, y, \ldots) = 0$, where $F$ is formed by rational operations and application of continuously differentiable functions of one variable, it is only necessary to test that the derivative $F_z$ at the point is nonzero.

(a) $F_z = 1$

(b) $F_z = -6$

(c) For $F(x, y, z) = 1 + x + y - \cosh(x + z) - \sinh(y + z)$, $F_z = 1$.


3. For $f(x, y, z) = x + y + z + xyz^3$, $f_z(0, 0, 0) = 1 \neq 0$. Second- through fourth-order terms vanish; $z = -x - y + \cdots$.


Exercises 3.2a (p. 235) 


1.

(a) Equation satisfied only by point (0, 0); tangent and normal do not exist.

(b) $(\xi - x)[e^x \sin y - e^y \sin x] + (\eta - y)[e^x \cos y + e^y \cos x] = 0$;
$(\xi - x)[e^x \cos y + e^y \cos x] - (\eta - y)[e^x \sin y - e^y \sin x] = 0$.

(c) Equation satisfied only by points $(-1, \pi/2 + 2k\pi)$; tangent and normal do not exist.

(d) $(\xi - x)(2x + \cos x) + (\eta - y)(2y - 1) = 0$;
$(\xi - x)(2y - 1) - (\eta - y)(2x + \cos x) = 0$.

(e) (& — x) (8x?) + m — y) (4y? — sinh y) = 0; 
(E — x) (4y? — sinh y) — (n — y) (3x?) = 0. 


(f) Equation satisfied only on positive x- and y-axes. For $x = 0$, $y > 0$, tangent is $x = 0$, and normal, $\eta = y$; for $y = 0$, $x > 0$, tangent is $y = 0$, and normal $\xi = x$.


—1 


2. From Volume I, p. 437, Problem 5 of 4.1h,
\[ k = \frac{r^2 + 2r'^2 - rr''}{(r^2 + r'^2)^{3/2}}, \]
where the primes indicate derivatives with respect to $\theta$. Enter the expressions for $r'$ and $r''$ in terms of the partial derivatives of $f$ in the formula for $k$ to obtain
\[ k = \frac{r^2f_r^3 + r(f_r^2f_{\theta\theta} - 2f_\theta f_r f_{r\theta} + f_\theta^2 f_{rr}) + 2f_\theta^2 f_r}{(f_\theta^2 + r^2f_r^2)^{3/2}}. \]


3. Observe that $F_{xx} = F_{yy} = 6(x + y - a) = 0$ when $x + y = a$. Apply (13):
\[ F_y^2F_{xx} - 2F_xF_yF_{xy} + F_x^2F_{yy} = -54axyF_{xy} = 0, \]
since $xy = 0$ at an intersection.


4. $a = \pm 1$, $b = -4$.

5. The circles $K$, $K'$, $K''$ may be denoted by the equations
\[ K = x^2 + y^2 + ax + by + c = 0, \]
\[ K' = x^2 + y^2 + a'x + b'y + c' = 0, \]
\[ K'' = x^2 + y^2 + a''x + b''y + c'' = 0. \]
Then any circle passing through $A$ and $B$ is given by $K' + \lambda K'' = 0$. The conditions that the circle $K$ should be orthogonal to $K'$ and $K''$ are $aa' + bb' - 2(c + c') = 0$, $aa'' + bb'' - 2(c + c'') = 0$. From these conditions the corresponding relation expressing the orthogonality of $K$ and $K' + \lambda K''$ readily follows.


Exercises 3.2b (p. 237) 


1. 


(a) Double point 

(b) Two branches tangent to x-axis 

(c) A corner: for x = 0+ the slope is 0, for x = 0- the slope is 1 
(d) Cusp 

(e) Cusp. 


. The coordinate axes. 
. y = x?(1 + x1/2), The two branches of the curve forming the cusp at the 


origin lie on the same side of their common tangent. 


. The curves are obtained by rotation through the angle « from the curve 


(x — b} = cy?. 


5. Differentiate the equation $F = 0$ twice with respect to $x$ and use the fact that $F_y = 0$.
\[ \phi = \arctan\frac{2\sqrt{F_{xy}^2 - F_{xx}F_{yy}}}{F_{xx} + F_{yy}}; \]
thus,

(a) $\pi/2$;
(b) $\pi/2$.

. Note that the tangents at the origin are y = 0 and ax + by = 0. In the 


respective cases, expand y to second order: 
E Se _ a 

y= 9 yo” x? + and y= b 
Enter these expressions in the original equation to obtain yo”. 


k=, p= A08 — abf — abe —b%e) 
a’ a(a2 + 62)3/2 . 


TETEE my 


Exercises 3.2c (p. 240) 


1. 


(a) $5x + 7y - 21z + 9 = 0$
(b) 20x + 13y + 3z = 36 
(c) x-—y—z+7/6=0 
(d) $x + 2z - 2 = 0$
(e) The surface has no tangent plane at the point. 
(f) z=0. 


. Each equation is in the form F(x, y, z) = constant. The vectors (Fz, Fy, 


F.) perpendicular to the respective surfaces are given by 


02-9), (2. es estes 
zg 2P WWF Vy? + 22? v¥x®@ + 22 Vy? + 2)’ 


(ss oJ as - FS 
V@te’ PFA vepe Vy tal 


The scalar product of any two of these vectors vanishes. 


. x (y + z2) = ay. 
4. Since this is a surface of revolution, we may assume $y = 0$. Let $(a, 0, c)$ be a point of the surface, that is, $a^2 - c^2 = 1$. The tangent plane at the point is $ax - cz = 1$. The intersection lines are $(z - c)c = (x - a)a = \pm acy$.


5. From Euler's relation the equation
\[ (\xi - x)F_x + (\eta - y)F_y + (\zeta - z)F_z = 0 \]
for the tangent plane can be put in the form
\[ \xi F_x + \eta F_y + \zeta F_z = xF_x + yF_y + zF_z = hF(x, y, z) = h. \]
Iz 2I 
T xy OO 22 — xy" 


. (a) 0 


(b) arc cos 1/76 
(c) arc cos 4/5 
(d) $\pi/2$

(e) Not defined. 


Exercises 3.3a (p. 246) 


1. 


(a) Circles &2 + n? = e®*; lines through origin € sin y — n cos y = 0. 

(b) Parabolic arcs, n = Vx? — Æx , n = vy? + Ey. 

(c) y = cos x(1 + 1/62), n = cos y(1 + &?). 

(d) Parabolas — = n? — 2n(x? + 1) + xt + 3x + 1, n= E2 — Dey + y4 + 
y+. 

(e) E = x0? q = yh, 

(£) Lines £ = constant, » = constant(y = 1). 

(g) Elliptical arcs &? — 2 sin 2x + n2 = cos? 2x, &2 — 2&n sin 2x + n? 
= cos? 2y. 

(h) Segments & = e°8 2, (e-! <n Se), q =e, (e! SE Se). 


2. The equation admits only the values $x = y = 0$. Hence, the region is the plane. Its image is the open first quadrant in the $\xi, \eta$-plane.

3. The region bounded by the two circles $\xi^2 + \eta^2 = 8$, $\xi^2 + \eta^2 = 32$ and the hyperbolas $\xi^2 - \eta^2 = 2$, $\xi^2 - \eta^2 = 6$.

4. No. The origin of the $\xi, \eta$-plane is the image of any point $(0, y)$.


Exercises 3.3b (p. 248) 


1. 


For this, it is only necessary to show that at a given point with Cartesian coordinates $(a, b)$ the curves $\xi = \alpha$, $\eta = \beta$, where $\alpha = (\sin b)/(a - 1)$ and $\beta = a\tan b$, have different directions. For $\xi = \alpha$,
\[ \frac{dx}{dy} = \frac{(a - 1)\cos b}{\sin b}; \]
for $\eta = \beta$,
\[ \frac{dx}{dy} = \frac{-a}{\cos b\,\sin b}. \]
Thus, curvilinear coordinates are defined for all points except those that satisfy $\cos^2 b = a/(1 - a)$.


. (ŒE — 128 + XE — 1)? = 1. 
3. As in the solution of Exercise 1, those points with Cartesian coordinates $(a, b)$ for which the curves $\xi = \alpha$ and $\eta = \beta$ have the same direction; in this case, the points on the 45°-lines $b = \pm a$.


Exercises 3.3c (p. 251) 


1. Use 
E2 + n? + OP = (x? + y? + 2)? 
to obtain 
—~ _§ — l - ~__§ 
ec ae. Tepee 
2. $r = \sqrt{x^2 + y^2 + z^2 + w^2}$,

$$\phi = \arctan\frac{\sqrt{x^2 + y^2 + z^2}}{w}, \qquad \psi = \arctan\frac{\sqrt{y^2 + z^2}}{x},$$

$\theta = \arctan(z/y)$. Here r = constant is a three-sphere of radius r centered at the origin; $\phi$ = constant is the hypercone generated by all lines through O making the angle $\phi$ with the w-axis; the set $\psi$ = constant is the union of all planes through the w-axis that meet the x-axis at the angle $\psi$. The set $\theta$ = constant is the union of all three-spaces containing the x- and w-axes that meet the y-axis at angle $\theta$.
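
As a numerical cross-check of these four-dimensional polar coordinates, the following sketch converts a point to $(r, \phi, \psi, \theta)$ and back. The function names are hypothetical, and `atan2` is used in place of the plain arc tangent so that the quadrant is chosen automatically, which is an assumption not made in the text.

```python
import math

def to_polar4(x, y, z, w):
    """Return (r, phi, psi, theta) for the point (x, y, z, w)."""
    r = math.sqrt(x * x + y * y + z * z + w * w)
    phi = math.atan2(math.sqrt(x * x + y * y + z * z), w)  # angle with the w-axis
    psi = math.atan2(math.sqrt(y * y + z * z), x)          # angle with the x-axis
    theta = math.atan2(z, y)                               # angle in the y, z-plane
    return r, phi, psi, theta

def from_polar4(r, phi, psi, theta):
    """Invert to_polar4 by peeling off one coordinate at a time."""
    w = r * math.cos(phi)
    rho = r * math.sin(phi)        # length of the (x, y, z) part
    x = rho * math.cos(psi)
    sigma = rho * math.sin(psi)    # length of the (y, z) part
    return x, sigma * math.cos(theta), sigma * math.sin(theta), w
```

Fixing r while varying the three angles keeps the point on the three-sphere of radius r, in line with the description above.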


Exercises 3.3d (p. 255) 


1. (a) ad — be (d) 


i 
(b) $1/\sqrt{x^2 + y^2}$ (e) $-3x^2y^2$
(c) $4xy$ (f) 9x22 + 1.
2. If $ad - bc = 0$, all points; if $ad - bc \ne 0$, none.
(b) None. (The transformation is not defined for x = y = 0.) 


(c) The coordinate axes.
(d) None. Note, however, that there is no over-all inverse because the points $(x, y + 2n\pi)$ all have the same image.
(e) The coordinate axes.
(f) None.


3. (a) D=e?*; xe = yn = ENE + n2); xn = — ye = nA (E2 + n2); xee = yen = 


(b) 


(c) 


(d) 


(e) 


(a 


~~ 


—xXnn = (E? — y?)/(E? + ?)?5 yeg = — xen = — ynn = —2En/(E2 + n2). 
D = A(x? + y3); with r= Ve +772, 0=arc tan një; xe= yn = 
vr cos 49; ye =—xn = —}$ Vr sin $0; x = yen = —Xnn = 
—} r?2 cos 30/2; yee = —xen = — Ynn = tr?/? sin 36/2. 

D=2 sin(x — y)/cos?(x + y). xe = ye = 1/2(1 + 8); xn = yn = 
V/2V1 —72; xee = yee = — ENCL + €2)2; xen = yen = 0; Xnn = — ym = 
/2(1 — ?)8”, 

$D = \cosh(x + y)$; $x_\xi = (\cosh y)/D$; $x_\eta = -(\sinh y)/D$; $y_\xi = (\sinh x)/D$; $y_\eta = (\cosh x)/D$.

$x_{\xi\xi} = -[\cosh^2 y \, \sinh(x + y) + \sinh^2 x]/D^3$;

$x_{\xi\eta} = \tfrac{1}{2}[\sinh 2y \, \sinh(x + y) - \sinh 2x]/D^3$;

$x_{\eta\eta} = -[\sinh^2 y \, \sinh(x + y) + \cosh^2 x]/D^3$;

$y_{\xi\xi} = [\cosh^2 y - \sinh^2 x \, \sinh(x + y)]/D^3$;

$y_{\xi\eta} = -\tfrac{1}{2}[\sinh 2y + \sinh 2x \, \sinh(x + y)]/D^3$;

$y_{\eta\eta} = [\sinh^2 y - \cosh^2 x \, \sinh(x + y)]/D^3$.

D = 6x°y — 3y4. xe = 2x/3(2x3 — y3) 

Xn = —y/(2x8 — y?), ye = —y/3(2x3 — y3); 

yn =x? /y(2x? — y?). xe = — §x(8x3 + 5y3)/(2x3 — y9)8; 

xen = 2y(7Tx3 + y3)/3(2x3 — >); 

Xon = —2x7(x8 + 4y*)/y(2x3 — y?)3; 

yee = 2y(7x3 + y?)/3(2x? — y3)? 

yen = —2x?(x8 + 4y8)/3y(2x3 — y3)? 

yan = 2x(y® + 3x3y3 — x®)/y3(2x3 — y3)8, 
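
The first derivatives quoted for the case $D = \cosh(x + y)$ can be checked by inverting a numerically computed Jacobian matrix. The sketch below assumes the forward map $\xi = \sinh x + \cosh y$, $\eta = -\cosh x + \sinh y$; this map is not stated in the solution and is only one choice consistent with the quoted derivatives. All function names are hypothetical.

```python
import math

def forward(x, y):
    # Assumed forward map (hypothetical, see the lead-in).
    return math.sinh(x) + math.cosh(y), -math.cosh(x) + math.sinh(y)

def jacobian(x, y, h=1e-6):
    """Central-difference estimates of xi_x, xi_y, eta_x, eta_y."""
    xi_xp, eta_xp = forward(x + h, y)
    xi_xm, eta_xm = forward(x - h, y)
    xi_yp, eta_yp = forward(x, y + h)
    xi_ym, eta_ym = forward(x, y - h)
    return ((xi_xp - xi_xm) / (2 * h), (xi_yp - xi_ym) / (2 * h),
            (eta_xp - eta_xm) / (2 * h), (eta_yp - eta_ym) / (2 * h))

def inverse_first_derivatives(x, y):
    """Return D, x_xi, x_eta, y_xi, y_eta by inverting the 2x2 Jacobian matrix."""
    xi_x, xi_y, eta_x, eta_y = jacobian(x, y)
    D = xi_x * eta_y - xi_y * eta_x
    return D, eta_y / D, -xi_y / D, -eta_x / D, xi_x / D
```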


4. (a) Let $m_1$ and $m_2$ be the slopes of two curves passing through the point (a, b) of the x, y-plane. Let $\mu_1$ and $\mu_2$ be the corresponding slopes at the corresponding point in the $\xi$, $\eta$-plane. Use

$$\mu = \frac{d\eta}{d\xi} = \frac{d\eta/dx}{d\xi/dx} = \frac{(\partial\eta/\partial x) + m(\partial\eta/\partial y)}{(\partial\xi/\partial x) + m(\partial\xi/\partial y)} = \frac{m(a^2 - b^2) - 2ab}{b^2 - a^2 - 2mab}$$

to obtain

$$\frac{\mu_2 - \mu_1}{1 + \mu_1\mu_2} = \frac{m_1 - m_2}{1 + m_1m_2}.$$

Thus, the angle between the two curves is preserved in magnitude but reversed in orientation.
(b) Observe that $\xi^2 + \eta^2 = 1/(x^2 + y^2)$. Express the circle $(x - a)^2 + (y - b)^2 = r^2$ in the form $x^2 + y^2 - 2ax - 2by = r^2 - a^2 - b^2$. This transforms into the curve

$$\frac{1}{\xi^2 + \eta^2} - \frac{2a\xi}{\xi^2 + \eta^2} - \frac{2b\eta}{\xi^2 + \eta^2} = r^2 - a^2 - b^2$$

or

$$(\xi^2 + \eta^2)(r^2 - a^2 - b^2) + 2a\xi + 2b\eta = 1.$$

This is a circle in the $\xi$, $\eta$-plane unless the original circle passes through the origin; then $r^2 - a^2 - b^2 = 0$ and the image is a straight line.
(c) $-1/(x^2 + y^2)^2$.
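
The reversal of orientation found in Exercise 4(a) can be illustrated numerically. This sketch assumes the inversion $\xi = x/(x^2 + y^2)$, $\eta = y/(x^2 + y^2)$, which is the map consistent with $\xi^2 + \eta^2 = 1/(x^2 + y^2)$; the function names are hypothetical.

```python
def image_slope(m, a, b):
    """Slope of the image at the image of (a, b) of a curve with slope m through (a, b)."""
    return (m * (a * a - b * b) - 2 * a * b) / (b * b - a * a - 2 * m * a * b)

def tangent_of_angle(s1, s2):
    """Tangent of the angle from a line of slope s1 to a line of slope s2."""
    return (s2 - s1) / (1 + s1 * s2)
```

For two curves through the same point, `tangent_of_angle` applied to the image slopes should be the negative of its value for the original slopes.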
5. By the solution of Exercise 4(b), an inversion maps $P_1P_2P_3$ into an ordinary triangle with the same angles.
6. Let $m_1$, $m_2$ be the slopes of curves passing through the point (a, b) and $\mu_1$, $\mu_2$ the corresponding slopes of their images. From

$$\mu = \frac{dv/dx}{du/dx} = \frac{\psi_x + m\psi_y}{\phi_x + m\phi_y} = \frac{\psi_x + m\psi_y}{\psi_y - m\psi_x},$$

it follows that

$$\frac{\mu_2 - \mu_1}{1 + \mu_2\mu_1} = \frac{m_2 - m_1}{1 + m_2m_1}.$$


7. The normal is given by

$$\frac{\xi - x}{u_x} = \frac{\eta - y}{u_y} = u - \zeta.$$

It passes through the z-axis if and only if $xu_y - yu_x = 0$. The surface is a surface of revolution if and only if z = f(w) where $w = x^2 + y^2$. Thus, the curves z = constant and w = constant are the same and the mapping $(x, y) \to (w, z)$ must have a vanishing Jacobian, that is,

$$\frac{d(w, z)}{d(x, y)} = 2\begin{vmatrix} x & y \\ u_x & u_y \end{vmatrix} = 0.$$

8. (a) If either t < b (ellipse) or b < t < a (hyperbola), the foci are $(0, \pm c)$, where $c = \sqrt{a - b}$.
(b) If we denote the left-hand side of the equation defining $t_1$ and $t_2$ by F(x, y, t), two curves $t_1$ = constant and $t_2$ = constant are given implicitly by the equations $F(x, y, t_1) = 1$ and $F(x, y, t_2) = 1$, respectively. The condition that these should be orthogonal is therefore

$$0 = F_x(x, y, t_1)F_x(x, y, t_2) + F_y(x, y, t_1)F_y(x, y, t_2) = 4\left[\frac{x^2}{(a - t_1)(a - t_2)} + \frac{y^2}{(b - t_1)(b - t_2)}\right];$$

but this relation is an immediate consequence of $F(x, y, t_1) - F(x, y, t_2) = 0$.
(c) The coefficients of the quadratic equation defining $t_1$ and $t_2$ are equal to $t_1t_2$ and $-(t_1 + t_2)$, respectively. We thus obtain two linear equations in $x^2$ and $y^2$, whence

$$x = \pm\sqrt{\frac{(a - t_1)(a - t_2)}{a - b}}, \qquad y = \pm\sqrt{\frac{(b - t_1)(b - t_2)}{b - a}}.$$
(d) d(tı, t2) — — YG 2 , 
d(x, y) v{(a + b} — 2a — b) (x? — y?) + (x? + y?)?} 
(e) fi'gy fo’ gz 


(a—t1)(b—th) (a — ta) (b — ta) 
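
The orthogonality argument of Exercise 8(b) can be checked numerically. This sketch assumes that the defining equation is $x^2/(a - t) + y^2/(b - t) = 1$ with a > b, which is the form consistent with the formulas above but is not restated in the solution; the function names are hypothetical.

```python
import math

def confocal_parameters(x, y, a, b):
    """Roots t1 <= t2 of x^2/(a - t) + y^2/(b - t) = 1, written as a quadratic in t."""
    p = a + b - x * x - y * y          # sum of the roots
    q = a * b - b * x * x - a * y * y  # product of the roots
    root = math.sqrt(p * p - 4 * q)
    return (p - root) / 2, (p + root) / 2

def gradient(x, y, t, a, b):
    """Gradient of F(x, y, t) = x^2/(a - t) + y^2/(b - t) with respect to (x, y)."""
    return 2 * x / (a - t), 2 * y / (b - t)

def gradients_dot(x, y, a, b):
    """Scalar product of the normals of the two confocal conics through (x, y)."""
    t1, t2 = confocal_parameters(x, y, a, b)
    g1, g2 = gradient(x, y, t1, a, b), gradient(x, y, t2, a, b)
    return g1[0] * g2[0] + g1[1] * g2[1]
```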


9. (a) Let F(t) be the left-hand side of the equation defining t. F is a continuous function of t in $-\infty < t < c$, for which $F(-\infty) = 0$, $F(c - 0) = +\infty$; hence, F = 1 at one point at least of that interval. Similar conclusions apply to the other intervals.

(b) Cf. Exercise 8 (b).
(c) Cf. Exercise 8 (c).

$$x = \pm\sqrt{\frac{(a - t_1)(a - t_2)(a - t_3)}{(a - b)(a - c)}},$$

with similar formulae for y and z.
10. (a) Apply the result of Exercise 6.

(b) Let $x = r\cos\theta$, $y = r\sin\theta$. Then the straight line $\theta$ = constant is transformed into the conic $t_1 = \tfrac{1}{2} - \cos^2\theta$ and the circle r = constant into the conic $t_2 = -\tfrac{1}{4}[r^2 + (1/r^2)]$.


11. (b) Use (24d) as follows 


or apply the result of part (a). 


Exercises 3.3e (p. 260) 


1. (a) 1. (b) $4x^3$. (c) $\dfrac{\exp[2x/(x^2 + y^2)]}{(x^2 + y^2)^2}$.
2. (a), (c). In part (b), $u_0 = v_0 = 1$ is not in the range of the composite transformation.


3. Apply (31b). 
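4. The inverse transformation

$$x = p(\xi, \eta), \qquad y = q(\xi, \eta)$$

exists. The first result is obtained by forming the composition of the given mapping with

$$z = f(p(\xi, \eta), q(\xi, \eta)) = \alpha(\xi, \eta), \qquad \eta = \eta = \beta(\xi, \eta),$$

whence

$$\frac{d(z, \eta)}{d(\xi, \eta)} = \frac{d(z, \eta)}{d(x, y)}\,\frac{d(x, y)}{d(\xi, \eta)} = \frac{d(z, \eta)/d(x, y)}{d(\xi, \eta)/d(x, y)}.$$

But

$$\frac{d(z, \eta)}{d(\xi, \eta)} = \begin{vmatrix} \partial z/\partial\xi & \partial z/\partial\eta \\ 0 & 1 \end{vmatrix} = \frac{\partial z}{\partial\xi}.$$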


Exercises 3.3f (p. 266) 


1. (a), (b). In part (c), the given values do not satisfy the equations. 


Exercises 3.3g (p. 273) 


1. With w = v — 1, 


xe = 1 + 3 (u + w) + È (u? — 2uw — w', 
ye =1— 3 (u — w) +È (u? + 2uw — w%), 


2. The same. 


Exercises 3.3h (p. 275) 


1. $\xi = x + x|x|$, $\eta = y$.
2. If the functions are dependent, $\partial(\xi, \eta)/\partial(x, y) = a\beta - b\alpha = 0$.


Exercises 3.3i (p. 277) 


1. (a) $-e^{2x} \cos y$
(b) 0.
(c) − [ele i eo a come Z − (cosh z)y?-! sinh xl.


(d) — x? sin z. 


(e) x. 


. There exists a region on which some function of $\xi$, $\eta$, $\zeta$ vanishes. The condition for this is $\partial(\xi, \eta, \zeta)/\partial(x, y, z) = 0$.


. The triple of Exercise 1(b) is dependent: 


(n? + e2) [n + pẹ — &)? + €] = 2 + pX. 


1 1 
ety = 2y 2z\|=0; &—yn-—20=—0. 
ys y+z x+z y+x 


. (a) Since the angle between two surfaces is the angle between their normals, we need show only that the angle between any two directions is unchanged. Let s be arc length on any curve in x, y, z space and $t = (\dot{x}, \dot{y}, \dot{z}) = \dot{X}$ the unit tangent vector, where the dot denotes differentiation with respect to s. The direction of t maps into the direction of

$$\tau = \frac{(\dot\xi, \dot\eta, \dot\zeta)}{(\dot\xi^2 + \dot\eta^2 + \dot\zeta^2)^{1/2}} = \frac{\dot{Y}}{|\dot{Y}|}.$$

The image direction $\tau$ is given in terms of t and X by

$$\tau = t - \frac{2(t \cdot X)X}{|X|^2}.$$

From this it follows easily that the cosine of the angle between two curves meeting at X is given by $\tau_1 \cdot \tau_2 = t_1 \cdot t_2$.
(b) Follows as does the solution of Exercise 4(b), p. 256.
(c) $-1/(x^2 + y^2 + z^2)^3$.


Exercises 3.4a (p. 286) 


l. 


. EG — F? = 


(a) ds? = sin?v du? + dv? 
(b) ds? = cosh?v du? + (1 + 2 sinh?v)dv? 
(c) ds? = (1 + f)dz? + f? dé? 


(tı — te) (tı — ts) (t2 — tı) (te — ts) 
la- mememl] + a tab t e—a 


E = G = cosh? (tja), F = 0. 


(d) ds? = 


. $X_u = (\cos v, \sin v, \alpha)$; $X_v = (-u \sin v, u \cos v, 0)$; hence, $X_u \cdot X_v = 0$.


. $ds^2 = (1 + z_x^2)\,dx^2 + 2z_xz_y\,dx\,dy + (1 + z_y^2)\,dy^2$.


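$$EG - F^2 = \begin{vmatrix} y_u & z_u \\ y_v & z_v \end{vmatrix}^2 + \begin{vmatrix} z_u & x_u \\ z_v & x_v \end{vmatrix}^2 + \begin{vmatrix} x_u & y_u \\ x_v & y_v \end{vmatrix}^2;$$

use the transformation formula for Jacobians.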
6. Introduce coordinates x, y, z such that P becomes the origin; the tangent plane at P, the x, y-plane; and t, the x-axis. The equation of S then takes the form z = f(x, y), where $f(0, 0) = f_x(0, 0) = f_y(0, 0) = 0$. A plane $\Sigma$ through t is given by the equation $z = \alpha y$. We now introduce $r = \sqrt{y^2 + z^2}$ and x as coordinates in $\Sigma$; then the intersection of $\Sigma$ and S is given implicitly by the equation

$$\frac{r\alpha}{\sqrt{1 + \alpha^2}} = f\left(x, \frac{r}{\sqrt{1 + \alpha^2}}\right).$$

The curvature of the curve of intersection at the point x = 0, r = 0 is therefore (cf. p. 232) given by

$$k = f_{xx}\frac{\sqrt{1 + \alpha^2}}{\alpha}.$$

Thus, the center of curvature of this section has the coordinates

$$x = 0, \qquad y = \frac{1}{k\sqrt{1 + \alpha^2}} = \frac{\alpha}{f_{xx}(1 + \alpha^2)}, \qquad z = \frac{\alpha}{k\sqrt{1 + \alpha^2}} = \frac{\alpha^2}{f_{xx}(1 + \alpha^2)};$$

that is, it lies on the circle

$$f_{xx}(y^2 + z^2) - z = 0.$$

7. Take the tangent plane at P as the x, y-plane. Then the equation of S may be taken to be z = f(x, y). A normal plane is given by the equation $x = \alpha y$. Take $r = \sqrt{x^2 + y^2}$ and z as coordinates in the plane; the section is then given by

$$z = f\left(\frac{\alpha r}{\sqrt{1 + \alpha^2}}, \frac{r}{\sqrt{1 + \alpha^2}}\right),$$

and its curvature at r = 0 by

$$k = f_{xx}(0, 0)\frac{\alpha^2}{1 + \alpha^2} + 2f_{xy}(0, 0)\frac{\alpha}{1 + \alpha^2} + f_{yy}(0, 0)\frac{1}{1 + \alpha^2};$$

the final point of the vector of length $1/\sqrt{k}$ along the line t then has the coordinates

$$x = \frac{\alpha}{\sqrt{(1 + \alpha^2)k}}, \qquad y = \frac{1}{\sqrt{(1 + \alpha^2)k}}, \qquad z = 0;$$

that is, it lies on the conic

$$x^2f_{xx} + 2xyf_{xy} + y^2f_{yy} = 1.$$


8. (a) By differentiating the two equations with respect to a parameter t of the curve, we obtain

$$xx' + yy' + zz' = 0, \qquad axx' + byy' + czz' = 0.$$

From these relations we can find the ratio $x' : y' : z'$, that is, the direction of the tangent. If $(\xi, \eta, \zeta)$ are current coordinates, the equations of the tangent are

$$(\xi - x) : (\eta - y) : (\zeta - z) = \frac{c - b}{x} : \frac{a - c}{y} : \frac{b - a}{z}.$$
(b) By differentiating the equations of the curve a second time and using the result of (a), we obtain

$$xx'' + yy'' + zz'' = -(x'^2 + y'^2 + z'^2) = -\lambda^2\left\{\frac{(c - b)^2}{x^2} + \frac{(a - c)^2}{y^2} + \frac{(b - a)^2}{z^2}\right\}$$

and

$$axx'' + byy'' + czz'' = -\lambda^2\left\{\frac{a(c - b)^2}{x^2} + \frac{b(a - c)^2}{y^2} + \frac{c(b - a)^2}{z^2}\right\},$$

where $\lambda$ is a factor of proportionality. Eliminating $\lambda$, we have

$$(xx'' + yy'' + zz'')\left\{\frac{a(c - b)^2}{x^2} + \frac{b(a - c)^2}{y^2} + \frac{c(b - a)^2}{z^2}\right\} = (axx'' + byy'' + czz'')\left\{\frac{(c - b)^2}{x^2} + \frac{(a - c)^2}{y^2} + \frac{(b - a)^2}{z^2}\right\}.$$

This linear equation in $x''$, $y''$, $z''$ remains valid if we substitute $x'$, $y'$, $z'$ for $x''$, $y''$, $z''$. Hence, it is still satisfied if we replace $x''$, $y''$, $z''$ by some linear combination $\lambda x' + \mu x''$, $\lambda y' + \mu y''$, $\lambda z' + \mu z''$, respectively. Now if $(\xi, \eta, \zeta)$ is in the plane, $\xi - x$, $\eta - y$, $\zeta - z$ are just such a linear combination (cf. Exercise 6, p. 215).

The equation of the osculating plane is hence found to be


axe y by? B cz? OAL 
pét Erat yeaS. 


9. Take 0 as parameter for both curves. Then with u = 9, v = ġ, set 
du/dt = dv|dr = 1, dv|dt = —1, du/dt = 1, E = a?, G = a? sin?ð in (48). 
The tangents of the curves are given in coordinate vectors i, j, k by 
= a(cos 9 cos ¢ + sin 9 sin ¢)i 
+a(cos 9 sin ¢ F sin 9 cos ¢)j — a sin 0 k, 
and|X|?2 = a2(1 + sin20) in both cases. 
X = 2a(+ cos 0 sin ¢ — sin 9 cos ¢)i 
+ 2a(F cos 6 cos ¢ — sin 6 sin ¢)j 
— a cos 9 k. 
Apply the formula of Section 2.5 Exercise 8. 


Exercises 3.4b (p. 289) 


1. The mapping is conformal everywhere except at u = v = 0 because the Cauchy-Riemann equations are satisfied. At the origin all first derivatives vanish. In polar coordinates $u = r \cos\theta$, $v = r \sin\theta$ the mapping becomes $x = r^2 \cos 2\theta$, $y = r^2 \sin 2\theta$; thus, at the origin, all angles are doubled.
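
A short numerical illustration of the angle doubling: by the double-angle formulas, $x = r^2 \cos 2\theta$, $y = r^2 \sin 2\theta$ is the same as $x = u^2 - v^2$, $y = 2uv$. The sketch below uses that Cartesian form; the function names are hypothetical.

```python
import math

def square_map(u, v):
    """The mapping x = u^2 - v^2, y = 2uv."""
    return u * u - v * v, 2 * u * v

def image_polar(r, theta):
    """Polar coordinates of the image of the point with polar coordinates (r, theta)."""
    x, y = square_map(r * math.cos(theta), r * math.sin(theta))
    return math.hypot(x, y), math.atan2(y, x)
```

Two rays from the origin at polar angles 0.2 and 0.5 are mapped to rays at angles 0.4 and 1.0, so the angle between them grows from 0.3 to 0.6.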
2. Whenever it is defined; that is, everywhere except on the line u = 0.
3. Verify the Cauchy-Riemann equations with p = x% — yn, q = xn + ¥5, 
ap 9 ,,9x om | oy 
du ~ Ou + 6 Ou Y Ju 3u 
— 0, c9¥ 2E | Ox _ Og 
= eat Sa tant ay > a" 
4. (a) From (40f) it follows that $X_u \cdot X_u = X_v \cdot X_v = 4r^4/(u^2 + v^2 + r^2)^2$ and $X_u \cdot X_v = 0$. Set E = G and F = 0 in (48) to obtain the desired result.


(b) A circle on the sphere is the intersection of the sphere with a plane, 
say P. If the plane P passes through the north pole, stereographic 
projection maps the circle onto the intersection line of P with the 
x, y-plane. More generally, if P has the equation ax + by + cz = d, 
then, from (40f), 


(c — d) (u? + v?) + 2ar2u + 2br?v = re(cr + d), 


which is the equation of a line if c = d and a circle if c + d. 


(c) From (40f)

$$u = \frac{x}{1 - z/r}, \qquad v = \frac{y}{1 - z/r}.$$

Reflection in the equatorial plane yields the transformation $(u, v) \to (\xi, \eta)$, where

$$\xi = \frac{x}{1 + z/r}, \qquad \eta = \frac{y}{1 + z/r}.$$

Substituting for x, y, and z from (40f), we find

$$\xi = \frac{r^2u}{u^2 + v^2}, \qquad \eta = \frac{r^2v}{u^2 + v^2},$$

which are the equations of inversion in a circle of radius r.

(d) From the result of part (a),

$$ds^2 = \frac{4r^4}{(u^2 + v^2 + r^2)^2}(du^2 + dv^2).$$

5. The angle given by (48) must satisfy

$$\cos\omega = \frac{\dfrac{du}{dt}\dfrac{du}{d\tau} + \dfrac{dv}{dt}\dfrac{dv}{d\tau}}{\sqrt{[(du/dt)^2 + (dv/dt)^2]\,[(du/d\tau)^2 + (dv/d\tau)^2]}}.$$

Taking orthogonal pairs of vectors $(du/dt, dv/dt) = (0, 1)$ and $(du/d\tau, dv/d\tau) = (1, 0)$ yields F = 0. Similarly, the pair (1, 1), (1, −1) yields E = G. If E and G are not 0, the conditions

$$E = G, \qquad F = 0$$

are sufficient.
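
The statement of Exercise 4(c), that stereographic projection combined with reflection in the equatorial plane is an inversion, can be verified numerically. The sketch assumes that (40f) parametrizes the sphere $x^2 + y^2 + z^2 = r^2$ by projection from the pole (0, 0, r), which is the form consistent with $u = x/(1 - z/r)$; the function names are hypothetical.

```python
def sphere_point(u, v, r):
    """Point of the sphere of radius r whose projection from (0, 0, r) is (u, v) (assumed form of (40f))."""
    s = u * u + v * v + r * r
    return 2 * r * r * u / s, 2 * r * r * v / s, r * (u * u + v * v - r * r) / s

def project(x, y, z, r):
    """Stereographic projection u = x/(1 - z/r), v = y/(1 - z/r)."""
    return x / (1 - z / r), y / (1 - z / r)

def reflect_then_project(u, v, r):
    """Lift (u, v) to the sphere, reflect in the plane z = 0, and project back."""
    x, y, z = sphere_point(u, v, r)
    return project(x, y, -z, r)
```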
6. From the solution of Exercise 5, we require

$$E = \sin^2\phi = \phi'^2 = G.$$

Solving the equation $\phi' = \sin\phi$, we obtain

$$v = \log\tan\frac{\phi}{2} \qquad\text{or}\qquad \phi = 2\arctan e^v.$$
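
As a quick check that $\phi = 2\arctan e^v$ solves $\phi' = \sin\phi$, the derivative can be estimated by a central difference. The function names below are hypothetical.

```python
import math

def phi_of_v(v):
    """Solution phi = 2 arctan(e^v) of phi' = sin(phi)."""
    return 2 * math.atan(math.exp(v))

def v_of_phi(phi):
    """Inverse relation v = log tan(phi / 2), valid for 0 < phi < pi."""
    return math.log(math.tan(phi / 2))

def derivative(f, v, h=1e-6):
    """Central-difference estimate of f'(v)."""
    return (f(v + h) - f(v - h)) / (2 * h)
```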


Exercises 3.5a (p. 292) 


1. (a) A family of similar ellipses centered at the origin with axes aligned 
with the coordinate axes. 


(b) The family of circles tangent to the x-axis with centers on the y-axis. 
(c) Not a family. Each value of c yields the same curve, the unit circle $x^2 + y^2 = 1$.
2. The spheres of radius 1 with centers on the line 


x=y—1=5@+ VD. 
Exercises 3.5b (p. 295) 


1. No. For example, consider the normals to a straight line or circle. 
2. An envelope satisfies the parametric equations

$$x = -\psi'(c), \qquad y = -c\psi'(c) + \psi(c).$$

If $\psi'$ has an inverse $\phi$, we may set $\phi(-x) = (\psi')^{-1}(-x)$ and use $c = \phi(-x)$ to obtain the nonparametric equation

$$y = x\phi(-x) + \psi(\phi(-x)),$$

from which

$$y' = \phi(-x) - x\phi'(-x) - \psi'(\phi(-x))\,\phi'(-x) = \phi(-x).$$

Entering $c = \phi(-x) = y'$ in the expression for y, we obtain the desired result.
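
A concrete instance may help. The sketch below assumes the family of lines $y = cx + \psi(c)$, which is the family whose envelope has the parametric equations above, and picks the hypothetical example $\psi(c) = c^2$; the envelope is then the parabola $y = -x^2/4$. The function names are illustrative only.

```python
def psi(c):
    return c * c          # hypothetical choice of psi

def psi_prime(c):
    return 2 * c

def line(c, x):
    """Member of the assumed family y = c x + psi(c)."""
    return c * x + psi(c)

def envelope_point(c):
    """Point of the envelope: x = -psi'(c), y = -c psi'(c) + psi(c)."""
    return -psi_prime(c), -c * psi_prime(c) + psi(c)
```

For this choice every line of the family lies on or above the parabola and touches it at `envelope_point(c)`.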


Exercises 3.5c (p. 302) 


1. (a) Eliminate t to obtain

$$y = x\tan\alpha - \frac{g}{2v^2}x^2(1 + \tan^2\alpha).$$

Let $c = \tan\alpha$ be the parameter of the family:

$$(\alpha)\qquad y = cx - \frac{(1 + c^2)}{2v^2}gx^2.$$

The envelope has the equation

$$y = \frac{v^2}{2g} - \frac{g}{2v^2}x^2.$$

(b) For a fixed x, $dy/dc = x - cgx^2/v^2$ and $d^2y/dc^2 = -gx^2/v^2 < 0$. Since dy/dc = 0 on the envelope we conclude that for a given x the point on the envelope is the highest reachable target.


(c) For (x, y) with y below the maximum, the quadratic equation ($\alpha$) has two solutions for c.
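
The claim in part (b) can be checked numerically: for a fixed x the height $y = cx - (1 + c^2)gx^2/(2v^2)$ is largest at $c = v^2/(gx)$, and the maximum equals $v^2/(2g) - gx^2/(2v^2)$. The numerical values and function names below are illustrative assumptions.

```python
G = 9.81  # gravitational acceleration in m/s^2 (illustrative value)

def height(c, x, v):
    """Height at distance x of the trajectory with initial slope c = tan(alpha) and speed v."""
    return c * x - (1 + c * c) * G * x * x / (2 * v * v)

def best_slope(x, v):
    """Slope c at which dy/dc = x - c g x^2 / v^2 vanishes."""
    return v * v / (G * x)

def envelope(x, v):
    """Height of the envelope at distance x."""
    return v * v / (2 * G) - G * x * x / (2 * v * v)
```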


2. (a) The parabola y? = 4x. 
(b) The straight lines x = + 2y. 
(c) The hyperbolas xy = +}. 
(d) The straight lines y = +ax. 
3. Let the equation of the curve be given parametrically by $x = \phi(t)$, $y = \psi(t)$. The envelope of the family of circles satisfies

$$[x - \phi(t)]^2 + [y - \psi(t)]^2 = p^2$$

and

$$[x - \phi(t)]\phi'(t) + [y - \psi(t)]\psi'(t) = 0.$$

These are precisely the conditions that (x, y) lie at the distance p from the point $(\phi(t), \psi(t))$ in a normal direction.


4. We may introduce t as parameter on the curve, so that the latter is given by x = x(t), y = y(t), z = z(t) and the tangent at the point with parameter t lies in the two planes corresponding to t; this gives the relations

$$ax' + by' + cz' = 0, \qquad dx' + ey' + fz' = 0.$$

By differentiating the equations of the straight lines with respect to t, we thus obtain

$$a'x + b'y + c'z = 0, \qquad d'x + e'y + f'z = 0.$$

With the relation

$$ax + by + cz = dx + ey + fz$$

we then have three homogeneous equations in x, y, z, and the determinant must vanish.


5. (a) The parametric equations for C′ with t as parameter are defined by the equations

$$\xi x + \eta y = 1, \qquad \xi x' + \eta y' = 0.$$

Taking the ordinary derivative in the first equation with respect to t, we find, in view of the second equation,

$$\xi'x + \eta'y = 0.$$

This, coupled with the first equation, defines the polar reciprocal of C′ which is clearly the curve C.

(b) $\xi^2(1 - a^2) + \eta^2(1 - b^2) - 2ab\xi\eta + 2a\xi + 2b\eta = 1$.

(c) $a^2\xi^2 + b^2\eta^2 = 1$.

6. The equation of the generating tangent is

$$x\sin\theta + y\cos\theta = a(\theta\sin\theta + \cos\theta - 1).$$

7. If $(x^2/a^2) + (y^2/b^2) = 1$ is the equation of the conic, then $(x^2 + y^2)^2 = 4(a^2x^2 + b^2y^2)$ is the equation of the envelope. Note that if the conic is a rectangular hyperbola, this envelope is an ordinary lemniscate $(x^2 + y^2)^2 = 4a^2(x^2 - y^2)$.

8. (a) If $\Gamma$ is given parametrically by the vector equation $X = \Phi(t)$, the points Y of the pedal curve are defined by the conditions

$$(Y - X) \cdot Y = 0, \qquad Y \cdot X' = 0.$$

A point Z on the circle must satisfy $(Z - \tfrac{1}{2}X)^2 = \tfrac{1}{4}X^2$ or $Z^2 - Z \cdot X = 0$. To be on the envelope, then, Z must satisfy $Z \cdot X' = 0$. These are the conditions that Z be on the pedal curve.

(b) From the original definition of pedal curve, a cardioid $r = a(1 + \cos\theta)$, where a is the radius of the circle and $\theta$ is the azimuth with respect to the direction of the center from O.

9. If the ellipse has equation (x?/a?) + (y2/b2) = 1, the envelope is the 
ellipse with equation 

u? v 1 

b2(a2 + b2) © be 


Exercises 3.5d (p. 306) 


1. These are ellipsoids (x?/a?) + (y2/b?) + (z2/c?) = 1, with abc = k, where k 
is fixed. The envelope is xyz = k?/3,/27. 
2. These are planes with unit distance from O. Envelope, the unit sphere $x^2 + y^2 + z^2 = 1$.
3. (a) $\sqrt{x} + \sqrt{y} + \sqrt{z} = 1$.
(b) $x^{2/3} + y^{2/3} + z^{2/3} = 1$.
4. For the envelope we have the two equations

$$x\cos t + y\sin t + z = t,$$
$$-x\sin t + y\cos t = 1.$$

These two equations give a family of straight lines with parameter t; if a curve having these lines as tangents exists, it must also satisfy the equations obtained by differentiating once again.
(a) $r\sin[z + \sqrt{r^2 - 1} - \theta] + 1 = 0$.
(b) The curve is given by $z = \theta - \pi/2$, r = 1.
5. Let P(x, y, z) be a point on the tube-surface $\Sigma$, and let S be the sphere of the family that has the point P in common with $\Sigma$. Then S and $\Sigma$ have the same tangent plane at P, that is, the same values of x, y, z, $z_x$, $z_y$ at that point. It is therefore sufficient to prove that the relation is true for any sphere of unit radius that has its center in the x, y-plane, that is, for $u(x, y) = \sqrt{1 - (x - a)^2 - (y - b)^2}$.

6. Use inversion. Since $S_1$, $S_2$, $S_3$ pass through the origin, they are transformed into planes; we have then merely to find the envelope of the spheres touching three planes (i.e., a certain circular cone), which we reinvert:

$$(x^2 + y^2 + z^2)^2 - 2(x^2 + y^2 + z^2)(x + y + z) - 3(x^2 + y^2 + z^2 - 2xy - 2xz - 2yz) = 0.$$


. (a) If P describes the pedal curve $\Gamma'$ of $\Gamma$, construct on OP as diameter a circle in the plane perpendicular to the plane of $\Gamma$; the envelope is the surface generated by this variable circle.


(b) See the solutions of part (a) and Exercise 8(b) of Section 3.5c.


. This is the family (x/a) + (y/b) + (z/c) = 1, with abc = k. The envelope is defined by these equations together with

$$-\frac{x}{a^2} + \frac{zb}{k} = 0, \qquad -\frac{y}{b^2} + \frac{za}{k} = 0,$$

which yield, with the first equation, x/a = y/b = z/c = 1/3, whence xyz = k/27.


. Such a plane must contain the tangent vectors $T_1 = (a, 1, 0)$ at the point $(a^2, 2a, 0)$ of the first parabola and $T_2 = (b, 0, 1)$ at the point $(b^2, 0, 2b)$ of the second. The condition that the tangents intersect yields $b = \pm a$, with the intersection point $(-a^2, 0, 0)$. Using $T_1 \times T_2 = (1, -a, -b)$ as a normal to the plane, we then obtain its equation in the form $x - a(y \pm z) + a^2 = 0$, with a as parameter and, as an envelope, the parabolic cylinders $4x = (y \pm z)^2$.


Exercises 3.6a (p. 310) 


1. (a) $-\sin v$.
(b) $(a^3 + b^3 + c^3)(u - v) + 3abcv$.
(c) $4uv$.


Exercises 3.6b (p. 312) 


1. (a) $-2xy\,dx\,dy$.
(b) $(x^4 - 4x^2y^2 + y^4)\,dx\,dy$.
(c) $(a^2 + b^2)\,dx\,dy\,dz$.


. For $\omega = A\,dx + B\,dy + C\,dz$,

$$\omega^2 = A^2\,dx\,dx + B^2\,dy\,dy + C^2\,dz\,dz + AB(dx\,dy + dy\,dx) + BC(dy\,dz + dz\,dy) + CA(dz\,dx + dx\,dz)$$

and each term in $\omega^2$ clearly vanishes.
Alternatively, since we know for any two such forms that $\omega_1\omega_2 = -\omega_2\omega_1$, it follows that $\omega^2 = -\omega^2$; hence, $\omega^2 = 0$.


. Use the result of Exercise 2.
. Rewrite the left side in the form

$$[(\omega_1 + \omega_3) + (\omega_2 + \omega_4)]\,[(\omega_1 + \omega_3) - (\omega_2 + \omega_4)]$$

and apply the result of Exercise 3.


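5. $$L_1(L_2L_3) = (A_1\,dx + B_1\,dy + C_1\,dz)\left(\begin{vmatrix} B_2 & B_3 \\ C_2 & C_3 \end{vmatrix} dy\,dz + \begin{vmatrix} C_2 & C_3 \\ A_2 & A_3 \end{vmatrix} dz\,dx + \begin{vmatrix} A_2 & A_3 \\ B_2 & B_3 \end{vmatrix} dx\,dy\right)$$

$$= \left(A_1\begin{vmatrix} B_2 & B_3 \\ C_2 & C_3 \end{vmatrix} + B_1\begin{vmatrix} C_2 & C_3 \\ A_2 & A_3 \end{vmatrix} + C_1\begin{vmatrix} A_2 & A_3 \\ B_2 & B_3 \end{vmatrix}\right) dx\,dy\,dz,$$

where the coefficient of dx dy dz is the expansion in minors of the first row for the determinant

$$\begin{vmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ A_3 & B_3 & C_3 \end{vmatrix}.$$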
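
The determinant identity of Exercise 5 and the vanishing of $\omega^2$ from Exercise 2 can be checked with a small sketch in which a linear form is stored as its coefficient triple (A, B, C) and a second-order form as its coefficients of (dy dz, dz dx, dx dy). This representation and the function names are assumptions made for illustration.

```python
def product_1_1(p, q):
    """Product of two linear forms; returns coefficients of (dy dz, dz dx, dx dy)."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def product_1_2(p, w):
    """Coefficient of dx dy dz in the product of a linear form p with a second-order form w."""
    return p[0] * w[0] + p[1] * w[1] + p[2] * w[2]

def det3(r1, r2, r3):
    """3x3 determinant expanded along the first row."""
    return (r1[0] * (r2[1] * r3[2] - r2[2] * r3[1])
            - r1[1] * (r2[0] * r3[2] - r2[2] * r3[0])
            + r1[2] * (r2[0] * r3[1] - r2[1] * r3[0]))
```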
Exercises 3.6c (p. 316) 
1. (a) — ary + erp” 
(b) 2 dx dy 
(c) 0 
(d) x (cos y — 1) sin z 
(e) 0. 


2, For wi = Ai dx + Bi dy + Ci dz, (i = 1, 2), 


d(o12) = ites Ce + Bi 7 C —- Be—C rd 


' ea, a+ 61242 74s 0, a 2G) 


3y oy 
+ [7 Bit A: , Ba TZ As — Bi Se) }ax dy dz 

_ f[?C1 _ Br 3A _ aC, 
=|(5 Ta) Ae + (5, Je) B 

+ ae S| c dx dy dz 

Ox 
+ B2 C2 B (52 _ 2) 
Ai (5 02 ral Ox dz 
+6 (5 a dy dz 


= (dw1)m2 + w1(des). 
3. From Exercise 2, if $d\omega_1 = d\omega_2 = 0$, then $d(\omega_1\omega_2) = 0$.


Exercises 3.6d (p. 325) 


1. Considering F(X) = f(e, ¢, 0) = g(x, y, z) as a function of a point in 


space, we know from the invariance of the differential form that 


dF = dg = É de + GE dy + 2 de 


=vF'+dX 
=L de + 3f ag + Fae 
Consequently, 
vF + dX = (uiay -iga w] adx, 
whence 
=lat ly 4 L y 


do op osin 6 30 


Exercises 3.7b (p. 329) 


1. 


Q OF AUN 


(a) Saddles at y = 0, x = 2/3 + 2nr; minima at y = 0, x = —x/3 + 2nr. 

(b) Maxima at x= n/4 + 2nr, y=n/4+2nn, and x = 37/4 + 2nr, 
y = 3n/4 + 2nx; minima at x = n/4 + 2nn, y = 37/4 + 2nr, and 
x = 3n/4 + 2nr, y = x4 + 2nr. 

(c) Saddle at x = 0, y = 1. 

(d) No stationary points. 

(e) Saddle at x = 0, y = 0. 


. Maxima for x = 0, y = +1; minimum for x = y = 0. 

. Minimum for x = 1, y = 4, saddle point for x = —1, y = 2. 

. a/20, a/10, ajlo. 

. Improper minima on the planes x = 0, y = 1, z = —4. 

. Maximize $V = xy[100 - 2(x + y)]$. Maximum volume for x = y = 50/3, z = 100/3; $V_{\max} = (25/27) \times 10^4$ in³ ≈ 5.4 ft³.


. Set X = (x, y, z) and let the n points be $(a_i, b_i, c_i)$, where i = 1, 2, ..., n. To minimize $\sum[(x - a_i)^2 + (y - b_i)^2 + (z - c_i)^2]$, set

$$2\sum(x - a_i) = 2\sum(y - b_i) = 2\sum(z - c_i) = 0.$$

Hence, $x = (1/n)\sum a_i$, $y = (1/n)\sum b_i$, $z = (1/n)\sum c_i$. The sum is minimized at the center of gravity of the n points.


Exercises 3.7c (p. 334) 


1. Take

$$F(x, y, z) = xyz + \lambda[2(x + y) + z - 100].$$

From

$$F_x = yz + 2\lambda, \qquad F_y = zx + 2\lambda, \qquad F_z = xy + \lambda,$$

the extremum occurs when

$$V = xyz = -2\lambda x = -2\lambda y = -\lambda z.$$

Thus, z = 2x = 2y. Entering this in the subsidiary condition, we obtain z = 100/3, x = y = 50/3, as before.


x= y=4,2=% 
. x= —y = 1v2, z= 1. 
. Take the center of gravity of the n points as the origin and let their coordinates be $(a_i, b_i)$. Set X = (x, y) and let the line be given by Ax + By = C. Applying the method of Lagrange multipliers to

$$\sum[(x - a_i)^2 + (y - b_i)^2] + \lambda(C - Ax - By),$$

we obtain

$$2nx - \lambda A = 2ny - \lambda B = 0;$$

whence,

$$\lambda = \frac{2nC}{A^2 + B^2}.$$

Thus,

$$x = \frac{AC}{A^2 + B^2}, \qquad y = \frac{BC}{A^2 + B^2};$$

that is, X is the nearest point on the line to the center of gravity.


. Let S denote the curve f(x, y) = C and S′ the curve $\phi(x, y) = C'$. S and S′ have a point of contact in (a, b). In general, f(x, y) − C is positive on one side of S and negative on the other side in some neighborhood; similarly, with $\phi(x, y) - C'$ and S′. If, for example, f(a, b) is a maximum of f, then f(x, y) − C < 0 on S′; that is, S′ is wholly on one side of S, and then S is also on one side of S′. That is, $\phi(x, y) - C'$ has a constant sign on S, and as it is equal to 0 at (a, b), it has either a maximum or a minimum there.


Exercises 3.7e (p. 340) 


1. For smooth f and $\phi$, the minimum c characterizes a level surface f(x, y, z) = c tangent to the surface $\phi(x, y, z) = 0$.

2. Find a point on the intersection of the two cylinders $\phi(x, y) = 0$ and $\psi(y, z) = 0$ where f(x, y, z) is an extremum. Assuming f is smooth and the intersection is a smooth curve, this occurs where a level surface of f touches the curve.
Exercises 3.7f (p. 344)


1. Extremize

$$(x - a)^2 + (y - b)^2 + (z - c)^2 + \lambda(D - Ax - By - Cz)$$

to obtain the conditions

$$2(x - a) - \lambda A = 2(y - b) - \lambda B = 2(z - c) - \lambda C = 0,$$

whence

$$\lambda = \frac{2(D - aA - bB - cC)}{A^2 + B^2 + C^2}.$$

This yields

$$x = a + \frac{A(D - aA - bB - cC)}{A^2 + B^2 + C^2}, \ldots$$

and the minimum distance p is given by

$$p = \frac{|D - aA - bB - cC|}{\sqrt{A^2 + B^2 + C^2}}.$$
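
The formulas for the nearest point and the minimum distance translate directly into code. The sketch below takes the plane as Ax + By + Cz = D and the given point as (a, b, c); the function names are hypothetical.

```python
import math

def nearest_point_on_plane(point, normal, D):
    """Point of the plane A x + B y + C z = D closest to `point`."""
    a, b, c = point
    A, B, C = normal
    half_lambda = (D - a * A - b * B - c * C) / (A * A + B * B + C * C)  # lambda / 2
    return a + half_lambda * A, b + half_lambda * B, c + half_lambda * C

def distance_to_plane(point, normal, D):
    """Minimum distance p = |D - aA - bB - cC| / sqrt(A^2 + B^2 + C^2)."""
    a, b, c = point
    A, B, C = normal
    return abs(D - a * A - b * B - c * C) / math.sqrt(A * A + B * B + C * C)
```

For the point (1, 2, 3) and the plane x + 2y + 2z = 20 this gives the nearest point (2, 4, 5) at distance 3.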


- (4 + V5)/V2, (4 — V5)/72. 


. The maximum value is the same as for the expression $ax^2 + 2bxy + cy^2$ subject to the subsidiary condition $ex^2 + 2fxy + gy^2 = 1$.


. Cf. Exercise 3. 


(a) 14/3 + 2767/3. 
(b) The function has a non-strict maximum (p. 325) equal to 1.95, 
when y/x = 0.64. 


. The ellipse obviously touches the circle; that is, the two equations must give a double root in x. Hence, the condition for contact is $a^2(b^2 - 1) = b^4$: $a = 3/\sqrt{2}$, $b = \sqrt{3/2}$.


. $(-1/\sqrt{14}, -2/\sqrt{14}, -3/\sqrt{14})$. This is on the line joining the given point to the center.


. $A = a^2/x$, $B = b^2/y$, $C = c^2/z$, together with the subsidiary condition $(x^2/a^2) + (y^2/b^2) + (z^2/c^2) = 1$:


Bi a43 
(8) x= [que pe} GB 
q3l2 
b SOO eo e o 
(b) x va+b+ce’ 


. The vertices are given by $x = \pm a/\sqrt{3}$, $y = \pm b/\sqrt{3}$, $z = c/\sqrt{3}$.
. The vertices are given by $x = a^2/\sqrt{a^2 + b^2}$, $y = b^2/\sqrt{a^2 + b^2}$.
10. x = 1, y = 1.
11. The greatest axis is given by the maximum of $\sqrt{x^2 + y^2 + z^2}$, with the subsidiary condition that (x, y, z) lies on the ellipsoid. Hence, we have the three equations

$$\frac{x}{l} = \lambda(ax + dy + ez), \ldots.$$

Multiplying these by x, y, z, respectively, and adding, we have $\lambda = \sqrt{x^2 + y^2 + z^2} = l$. On the other hand, we may regard the equations as three linear homogeneous equations in x, y, z whose determinant must vanish.

12. (a) Equivalently, maximize

$$a\log x + b\log y + c\log z + \lambda(1 - x^k - y^k - z^k).$$

This yields

$$x^k = \frac{a}{\lambda k}, \qquad y^k = \frac{b}{\lambda k}, \qquad z^k = \frac{c}{\lambda k},$$

whence,

$$\lambda = \frac{1}{k}(a + b + c).$$

The maximum is attained when

$$x^k = \frac{a}{a + b + c}, \qquad y^k = \frac{b}{a + b + c}, \qquad z^k = \frac{c}{a + b + c}$$

and is equal to

$$\left[\frac{a^ab^bc^c}{(a + b + c)^{a+b+c}}\right]^{1/k}.$$

(b) Set $x^k = u/(u + v + w)$, $y^k = v/(u + v + w)$, $z^k = w/(u + v + w)$ in

$$(x^ay^bz^c)^k \le \frac{a^ab^bc^c}{(a + b + c)^{a+b+c}}.$$

13. Compare the similar proof for triangles on p. 328. A minimum point O does exist. First show that if O is not one of the vertices, then it can only be the point of intersection of the diagonals. Use the fact that the final points of four unit vectors whose vector sum is 0 form a rectangle. Then prove that the sum of the distances from the vertices is less for the point of intersection of the diagonals than for any of the vertices.

14. Suppose the pairs a, b and c, d are adjacent. Let $\phi$ be the angle between a and b, $\psi$ that between c and d. The problem is to maximize

$$A(\phi, \psi) = \tfrac{1}{2}(ab\sin\phi + cd\sin\psi)$$

subject to

$$f(\phi, \psi) = (a^2 + b^2 - 2ab\cos\phi) - (c^2 + d^2 - 2cd\cos\psi) = 0.$$

Setting the respective derivatives $(\partial/\partial\phi)(A + \lambda f)$ and $(\partial/\partial\psi)(A + \lambda f)$ equal to 0 we obtain

$$\lambda = -\frac{1}{4\tan\phi} = \frac{1}{4\tan\psi},$$

whence $\phi + \psi = \pi$. Consequently,

$$A = \tfrac{1}{2}(ab + cd)\sin\phi,$$

where $\cos\phi = \tfrac{1}{2}(a^2 + b^2 - c^2 - d^2)/(ab + cd)$. Eliminating $\phi$, we obtain the maximum area

$$A = \tfrac{1}{4}\sqrt{4(ab + cd)^2 - (a^2 + b^2 - c^2 - d^2)^2}$$

$$= \tfrac{1}{4}\sqrt{8abcd + 2(a^2b^2 + a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2 + c^2d^2) - (a^4 + b^4 + c^4 + d^4)},$$

which is clearly independent of our assumption concerning the order of the sides.

The conclusion that the maximum is independent of the order of the 
sides is geometrically obvious since any pair of adjacent sides may be 
interchanged without affecting the area of a convex polygon. 
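
The closed form for the maximum area in Exercise 14 can be compared with a direct scan over the angle $\phi$, solving the side condition for $\psi$ at each step. The scan resolution and the function names are assumptions made for illustration.

```python
import math

def max_area_formula(a, b, c, d):
    """Maximum area from the closed form (1/4) sqrt(4(ab+cd)^2 - (a^2+b^2-c^2-d^2)^2)."""
    return 0.25 * math.sqrt(4 * (a * b + c * d) ** 2
                            - (a * a + b * b - c * c - d * d) ** 2)

def max_area_scan(a, b, c, d, steps=20000):
    """Brute-force maximum of (ab sin(phi) + cd sin(psi)) / 2 under the side condition."""
    best = 0.0
    for i in range(1, steps):
        phi = math.pi * i / steps
        # The side condition fixes cos(psi) once phi is chosen.
        cos_psi = (c * c + d * d - a * a - b * b + 2 * a * b * math.cos(phi)) / (2 * c * d)
        if abs(cos_psi) <= 1:
            sin_psi = math.sqrt(1 - cos_psi * cos_psi)
            best = max(best, 0.5 * (a * b * math.sin(phi) + c * d * sin_psi))
    return best
```

Permuting the sides, for example (3, 4, 5, 6) and (3, 5, 4, 6), leaves the closed form unchanged, as the solution asserts.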


Exercises A.1 (p. 350) 


1. (a) Minimum at the origin. 


(b) For simplicity, introduce new variables u = x -+ y, v = x — y. We 
seek extreme values of 


f(u, v) = cos u + sin v + iu + v}. 


The conditions fu = fy = 0 yield (i) cos v = — sin u = — 3(u + v). 
We must entertain two possibilities: 
1. sin v = — cos u. In this case 


fuv? — fuufer = cos?u 


and only saddles are found. 

2. sin v = cos u. In this case, (i) yields u + v = — 7/2, we may have 
either u = —« or u =x + «. In the former case, fuv? — fuufov = 
cos u (1 — cos w) is positive and we obtain a saddle; in the latter case, 
it is negative and we obtain a minimum from fuu = fov = cosa + }ż. 


(c) No extreme, since fz > 0 everywhere. 


2. f(x) + f(y) + F(z) 


= 3f(a) + {æ — a) + (y — a) + (z — a)} f'(a) + 50°F" @+4, 
where oe? = (x — a)? + (y — a)? + (z — a)?. On the other hand, the 
subsidiary condition gives 
(x — a) + (y — a) + (z — a) 
— 2(/ ¢@M, \)_ #@ nn _ 
=e (-39@) +) ga 90-9 
+ (x — a) (z — a) + (y— a) (z — a) 
_(_¢%#@,9¢@,, 
= |- Fotot] 


where lim :=0. 
L.Y.2-a 
3. If $P_i = (x_i, y_i)$, $r_i = PP_i$, we have

$$d^2f = \sum_{i=1}^{3} d^2r_i = \sum_{i=1}^{3} r_i^{-3}[(y - y_i)\,dx - (x - x_i)\,dy]^2,$$

which is positive definite.

4. At the point Pi. Note that the function f = rı + re + rs is continuous 
in the whole plane but not differentiable at the points Pı, P2, Ps, where 
it has conical points (like the function z = V(x — x1)? + (y — yi)?, which 
geometrically represents a circular cone). Investigate the derivative of 
f at P in all directions around this point. 

5. (a) If we put $f = lx + my + nz$, $\phi = x^p + y^p + z^p - c^p$, $F = f - \lambda\phi$, then the conditions for stationary values are

$$(1)\qquad l = \lambda px^{p-1}, \quad m = \lambda py^{p-1}, \quad n = \lambda pz^{p-1}.$$

Multiplying these equations by x, y, z, respectively, and adding, we have

$$(2)\qquad lx + my + nz = \lambda pc^p.$$

Calculating x, y, z from (1) and substituting in $\phi = 0$, we get

$$\lambda p = \left(l^{p/(p-1)} + m^{p/(p-1)} + n^{p/(p-1)}\right)^{(p-1)/p}c^{1-p}.$$

Substitution of this expression for $\lambda p$ in (2) gives the stationary value.


(b) Cf. Exercise 6. Here we have

$$d^2F = -\lambda p(p - 1)(x^{p-2}\,dx^2 + y^{p-2}\,dy^2 + z^{p-2}\,dz^2);$$

as p > 0, this quadratic form is positive or negative definite according to whether $p \lessgtr 1$.
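6. The proof resembles that for n = 2 (p. 347). A positive definite quadratic form $\sum a_{ik}x_ix_k$ can be brought by a suitable transformation

$$x_i = \sum_{k=1}^{n} c_{ik}y_k \qquad (i = 1, \ldots, n)$$

with a nonvanishing determinant into the form $\sum a_{ik}x_ix_k = y_1^2 + y_2^2 + \cdots + y_n^2 \ge m(x_1^2 + \cdots + x_n^2)$, where m is a suitable positive constant. For the applications, it is important to remember that a necessary and sufficient condition that a form $\Phi = \sum a_{ik}x_ix_k$ shall be positive definite is that its principal minors of order 1, 2, ..., n, as indicated below,

$$a_{11}, \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \quad \ldots, \quad \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix},$$

shall all be positive. $\Phi$ is negative definite if $-\Phi$ is positive definite.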
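
The criterion on the principal minors is easy to apply mechanically. The sketch below evaluates the minors of order 1, 2, ..., n by Laplace expansion, which is adequate only for small matrices; the function names are hypothetical and the matrix is assumed symmetric.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def leading_minors(m):
    """Principal minors of order 1, 2, ..., n taken from the upper left corner."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]

def is_positive_definite(m):
    return all(d > 0 for d in leading_minors(m))

def is_negative_definite(m):
    # The form is negative definite exactly when its negative is positive definite.
    return is_positive_definite([[-e for e in row] for row in m])
```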
7. According to the first rule, we have to compute $d^2f$ from (3), with $dx_1, \ldots, dx_m$, $d^2x_1, \ldots, d^2x_m$ substituted from (1). Note that (1) implies that

$$d^2\phi_\mu = \sum \phi_{\mu x_ix_k}\,dx_i\,dx_k + \phi_{\mu x_1}\,d^2x_1 + \cdots + \phi_{\mu x_m}\,d^2x_m = 0 \qquad (\mu = 1, \ldots, m);$$

if this is multiplied by $\lambda_\mu$ and added to (3) for all values of $\mu$, we have $d^2f = d^2F = \sum F_{x_ix_k}\,dx_i\,dx_k$, because $d^2x_1, \ldots, d^2x_m$ drop out on account of the relations (2).

8. For F = f + 4¢ (disregarding a positive factor), we get 


@r= } dxdxr (dd = dx1 +++ dxn = 0). 
ee) 


Eliminating dxn, we have to show that the quadratic form 


—d?F = (dx ++ dxn-1)? — | ; 2 ax Ax 
l, = 


=1,...n 


' dxi dice 


= B d+ a 


n- 
i=1,. n i,k 


is positive definite. 
9. From dx = −dy − dz,

$$d^2F = -2s[(s - z)\,dy^2 + (s - x)\,dy\,dz + (s - y)\,dz^2].$$

When x = y = z the discriminant of $d^2F$ is positive and $d^2F$ is negative definite.


Exercises A.2 (p. 359) 


1. (c) Using polar coordinates $x = r\cos\theta$, $y = r\sin\theta$, take

$$f(x, y) = r^{n+1}\sin(n + 1)\theta,$$

for which

$$\nabla f = (n + 1)r^n(\sin n\theta, \cos n\theta).$$

2. (b) Extend the solution of Exercise 1:

$$f(x, y) = r^{-n+1}\sin(-n + 1)\theta$$

and

$$\nabla f = (n - 1)r^{-n}(\sin n\theta, -\cos n\theta).$$


3. If there is no fixed point, we have $u^2 + v^2 \ne 0$ everywhere in R. Since the convex region R is simply connected, it follows as on p. 358 that the index $I_C$ of the curve C with respect to the vector field is zero. On the other hand, since R is mapped into itself, the vector (u, v) for every point on C points into R or is tangential. This implies that $I_C = (1/2\pi)\oint_C d\theta = 1$ if C has the usual orientation determined by the x, y-coordinate system.
Exercises A.3 (p. 362)


1. (a) A node at $(0, 0)$, with tangents $x = \pm y$.


(b) The equations
$$f_x = 2x - 6x^2 + 4xy^2 = 0, \qquad f_y = 2y - 6y^2 + 4x^2y = 0$$
have the common solutions $(0, 0)$, $(\tfrac13, 0)$, $(0, \tfrac13)$, $(\tfrac12, \tfrac12)$, and $(1, 1)$,
of which only the first and last are points of the curve. At $(0, 0)$
the singularity is an isolated point. At $(1, 1)$, $f_{xx} = f_{yy} = 0$ and $f_{xy} = 8$;
the singularity is a node with tangents $x = 1$ and $y = 1$.

(c) A double tangent $y = x$ at $(0, 0)$. The curve has two branches; to
second order $y = x + x^2$.

(d) A double tangent y = 0 at (0, 0). The curve has a cusp. This is the 
same curve as that of Section 3.2b, Exercise 3. 


Exercises A.4 (p. 363) 


1. If the quadratic form is nondegenerate and definite, the singularity is an
isolated point; if nondegenerate and indefinite, the tangent lines at the
singularity form a cone. If the form is degenerate and semidefinite, the
tangent lines may lie in a plane where two branches are tangent to
each other, like the plane $z = 0$ for the surfaces

$$z^{2/3} + (x^2 + y^2)^{1/3} = a^{2/3}$$

at $(a, 0, 0)$ (a line cusp), and

$$z^4 = (x^2 + y^2)^3$$

at $(0, 0, 0)$ (two tangent branches). Or there may be a point cusp where
only one tangent line exists, like the line $x = y = 0$ for the former
surface at $(0, 0, a)$. If the form is degenerate and indefinite, the tangent
lines lie in two planes, like the planes $x = \pm y$ at $(0, 0, 0)$ for the surface
$x^2 - y^2 + z^3 = 0$.


Exercises A.5 (p. 364) 


1. The flow is stationary; that is, the fluid velocity is constant in time at
each point of space.

2. If $U = (u, v, w)$ is the velocity of the particle passing through the point
$X = (x, y, z)$ at time $t$, its acceleration is

$$\frac{d^2X}{dt^2} = \frac{dU}{dt} = \frac{dX}{dt} \cdot \nabla U + \frac{\partial U}{\partial t} = U \cdot \nabla U + \frac{\partial U}{\partial t}.$$
Exercises A.6 (p. 366)


1. (a) $x = -2 - 2\cos\alpha$, $y = -2\sin\alpha$ or $(x + 2)^2 + y^2 = 4$; $L = 4\pi$; $A = 4\pi$.

(b) $x = -\sin^3\alpha$, $y = -\cos^3\alpha$ or $x^{2/3} + y^{2/3} = 1$,
$$L = \frac32 \int_0^{2\pi} |\sin 2\alpha|\, d\alpha = 6 \int_0^{\pi/2} \sin 2\alpha\, d\alpha = 6.$$
$A = -(3/8)\pi$, where the sign comes from the clockwise orientation
of the curve.

2. Yes. Consider the right triangle with vertices $(0, 0)$, $(0, c)$, $(c^{-2}, 0)$ for
large $c$.

3. For the curve to be expressible as the envelope of its tangents, it must be
piecewise smooth.
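The lengths and signed areas quoted in Exercise 1 can be approximated by replacing each closed curve with an inscribed polygon. The sketch below is illustrative only; the function names and the number of steps are assumptions. The signed area is computed with the usual convention that counterclockwise curves give positive values.

```python
import math


def closed_curve_length_and_area(point, steps=20000):
    """Approximate a closed curve alpha -> point(alpha), 0 <= alpha <= 2*pi,
    by an inscribed polygon; return (length, signed area)."""
    pts = [point(2 * math.pi * k / steps) for k in range(steps)]
    length = 0.0
    twice_area = 0.0
    for k in range(steps):
        x0, y0 = pts[k]
        x1, y1 = pts[(k + 1) % steps]
        length += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0  # shoelace term
    return length, twice_area / 2


def circle(alpha):
    """Curve of part (a)."""
    return -2 - 2 * math.cos(alpha), -2 * math.sin(alpha)


def astroid(alpha):
    """Curve of part (b)."""
    return -math.sin(alpha) ** 3, -math.cos(alpha) ** 3
```

With these parametrizations the circle is traversed counterclockwise and the astroid clockwise, which is why the second area comes out negative.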


Exercises 4.1 (p. 374) 


1. In the nth subdivision, any square that contains points of $S$ contains
points of $T$; hence $A_n^+(S) \le A_n^+(T)$. On passing to the limit as $n \to \infty$, we
obtain the result.

2. In the nth subdivision, any square that contains points of $T - S$ cannot
be one that consists entirely of points of $S$, and both kinds of squares
contain points of $T$; therefore,
$$A_n^+(T) \ge A_n^+(T - S) + A_n^-(S).$$
Similarly,
$$A_n^-(T) \le A_n^-(T - S) + A_n^+(S).$$
Combining these results with $A_n^-(T - S) \le A_n^+(T - S)$, we find
$$A_n^-(T) - A_n^+(S) \le A_n^-(T - S) \le A_n^+(T - S) \le A_n^+(T) - A_n^-(S),$$
from which the result follows on passing to the limit as $n \to \infty$.


3. For the proof of (a), observe that any square of the nth subdivision
that enters in $A_n^+(S)$ or $A_n^+(T)$ may enter in only one or in both of
these; if a square enters into only one, it enters in $A_n^+(S \cup T)$; if it enters
in both, it enters in $A_n^+(S \cup T)$ but need not enter in $A_n^+(S \cap T)$, because
the square may contain points of both $S$ and $T$ without containing
points common to the two. Consequently,
$$A_n^+(S \cup T) + A_n^+(S \cap T) \le A_n^+(S) + A_n^+(T),$$
from which (a) follows.

For (b) we observe that any square that enters in one sum but not the
other, say, $A_n^-(S)$ but not $A_n^-(T)$, will enter in $A_n^-(S \cup T)$ but not
$A_n^-(S \cap T)$, and any square that enters in both $A_n^-(S)$ and $A_n^-(T)$ also
enters in both $A_n^-(S \cap T)$ and $A_n^-(S \cup T)$. Thus,
$$A_n^-(S) + A_n^-(T) \le A_n^-(S \cap T) + A_n^-(S \cup T),$$
from which (b) follows.

Note that a square consisting of points of $S \cup T$ need not consist
wholly of points of $S$ or wholly of those of $T$; consequently, the inequality
sign cannot be removed.

4. In the nth subdivision, consider any square that consists entirely of
points of $S \cup T$. If it contains any point of $S$, the square enters in $A_n^+(S)$,
but it cannot enter in $A_n^-(T)$, because it cannot consist wholly of
points of $T$. If the square contains no points of $S$, it must consist wholly
of points of $T$ and, thus, enters in $A_n^-(T)$. Finally, we observe that any
square that enters in $A_n^+(S)$ but does not lie wholly in $S \cup T$ must contain
a boundary point of $S \cup T$ and therefore enter $A_n^+(\partial[S \cup T])$. Combining
these results, we find
$$A_n^-(S \cup T) \le A_n^+(S) + A_n^-(T) \le A_n^-(S \cup T) + A_n^+(\partial[S \cup T]).$$
Since $\lim_{n\to\infty} A_n^-(S \cup T) = A(S \cup T)$ and $\lim_{n\to\infty} A_n^+(\partial[S \cup T]) = 0$, the desired
result follows.


5. (a) Let Jordan content in the original system be denoted by $A$, and in
the transformed system, by $B$. Since $A(\partial S) = 0$, $\lim_{n\to\infty} A_n^+(\partial S) = 0$.

Let $P$ be any point of $\partial S$. Note that in the nth subdivision, the
maximum distance from $P$ of any point of a square that contains
$P$ is $2^{-n}\sqrt{2}$. Now, in the nth subdivision with respect to the new
coordinate system, let $R_\beta$ be any square containing $P$. Form a larger
square $R_\beta^*$ with $R_\beta$ at its center and five subdivision squares on a
side. The smallest distance from any point of $R_\beta$ to the boundary of
$R_\beta^*$ is $2 \cdot 2^{-n}$. Thus, $R_\beta^*$ contains each square $R_\alpha$ that contains $P$
in the subdivision with respect to the original system. We conclude
that for each square that enters into $A_n^+(\partial S)$ no more than 25
squares enter $B_n^+(\partial S)$. Since $0 \le B_n^+(\partial S) \le 25 A_n^+(\partial S)$, it follows
that $\lim_{n\to\infty} B_n^+(\partial S) = 0$.

(b) Observe that in the nth subdivision with respect to the two systems,
any square that enters in $A_n^-(S)$ is covered by squares that enter into
$B_n^+(S)$. It follows that $A_n^-(S) \le B_n^+(S)$ and, passing to the limit
as $n \to \infty$, $A(S) \le B(S)$. By a parallel argument, $B(S) \le A(S)$.
Consequently, $A(S) = B(S)$.

The foregoing argument makes tacit use of the assumption that
if two sets $U$ and $V$ are made up of nonoverlapping congruent
squares from respective grids and $U \subset V$, then the number of
squares in $U$ is less than, or equal to, the number of squares in $V$.
We prove this inductively as follows: Let $u$ and $v$ be two finite
collections of nonoverlapping squares of side length $a$ from respective
grids such that the union $U$ of squares of $u$ is contained in the union
$V$ of squares of $v$. If $p$ is the number of squares of $u$, and $q$, the number
of squares of $v$, then $p \le q$ and equality holds if and only if $u = v$.
For the proof, we use induction on $p$.

If $p = 1$, we cannot have $q < p$; for, then, $q = 0$ and $V$ does not
contain $U$. Moreover, if $q = p = 1$, we note that opposite vertices of
the square of $u$ must be opposite vertices of the square of $v$, since the
maximum distance $a\sqrt{2}$ between any two points of either square is
attained only at opposite vertices. Consequently, the squares are the
same and $u = v$.

Now we prove that the truth of the hypothesis for a fixed $p$ implies
its truth for $p + 1$: Let $u$ be a collection of $p + 1$ squares and let
$u^*$ be any subcollection of $p$ squares. Suppose $q < p + 1$. Since $V \supset
U \supset U^*$, $q \ge p$ by the induction hypothesis. However, $p \le q < p + 1$
implies $q = p$, and hence, by the induction hypothesis, $v = u^*$.
But then $V$ cannot contain the one square of $u$ that does not belong
to $u^*$, contradicting $V \supset U$. We conclude that $q \ge p + 1$. If
equality holds, $q = p + 1$, we now show that $v = u$. We shall show
that the set $U (= V)$ must have a corner on the boundary; that is, at
least one of the squares $R$ of $u$ must have a vertex with its adjacent
edges on the boundary of $U$. The square $R$ must also belong to $v$, as
we shall prove. By the induction hypothesis, the collections $u^*$ and
$v^*$, obtained from $u$ and $v$ by deleting $R$, must be the same.
Consequently, $u = v$.

To prove that U has a corner, let P be any point of U most distant 
from an arbitrary given point Q. The point P must lie on the bound- 
ary of U, otherwise it would be an interior point and its neighbor- 
hood within U would contain points more distant from Q. Further- 
more, P must be a vertex of one of the squares of u, because if it were 
an inner point of an edge, at least one of the two vertices on the edge 
would be farther from Q than P, since it would be farther than P 
from the perpendicular from Q to the line of the edge. No two edges 
meeting at P can be aligned, for the same argument shows that one 
of the end points of the segment made up of the two edges must be 
more remote from Q than P. It follows that P and its adjacent edges 
can belong to only one square R of u. (The figure shows all possible 
configurations in the neighborhood of a boundary vertex.) Exactly 
the same argument applies to v, but then, R must belong to v, as 
claimed. 


6. If P is a boundary point of S, it is either a point of S and covered or a 
limit point of S such that every deleted neighborhood of P contains 
infinitely many points of S. Thus, P is the limit of a convergent sequence 
of distinct points of S. Since the collection of covering sets is finite, at
least one of these sets must contain a subsequence, and because this 
set is closed, it must contain the limit of the subsequence, P. 


7. The area of the set is zero. Let $S_n$ be the set of points for which both $p$
and $q$ are greater than $n$ and $T_k$ the set for which either $p$ or $q$ is equal
to $k$. Then
$$S = S_n \cup T_1 \cup T_2 \cup \cdots \cup T_n.$$
Note that $S_n$ is contained in the square
$$0 < x < \frac{1}{n}, \qquad 0 < y < \frac{1}{n}.$$
Consequently,
$$A_n^+(S_n) \le \left(\frac{1}{n} + \frac{1}{2^n}\right)^2.$$
Observe also that $T_k$ contains $2k - 1$ points, each of which may lie in
no more than four squares of the nth subdivision. Consequently,
$$A_n^+(T_k) \le \frac{4(2k - 1)}{2^{2n}}.$$
Summing, we see that
$$A_n^+(S) \le A_n^+(S_n) + \sum_{k=1}^{n} A_n^+(T_k) \le \left(\frac{1}{n} + \frac{1}{2^n}\right)^2 + \frac{4n^2}{2^{2n}},$$
whence $\lim_{n\to\infty} A_n^+(S) = 0$.


Exercises 4.6 (p. 405) 


1. (a) a?b? (a? — b?)/8. 


(b) —4. 
(c) log 2. 
(d) —a + (e% — 1)/b. 
(e) 7/16. 
(f) 4/3.
2. 7/2 
3. 0. 
4. 27. 
5. Use polar coordinates: 
TIA fv cos 20 r m 1 
(a) yn f Qa ree 1 Pp dr dé = 173 


TI3 7. /3/cos(Q—7/6) r _ V3 1 
(b) Í f agri 08 = -g atang. 




6. Use the substitution $x = a\xi$, $y = b\eta$, $z = c\zeta$; then use polar coordinates
and symmetry to obtain
$$8a^2b^2c^2 \int_0^{\pi/2} \int_0^{\pi/2} \int_0^1 \rho^5 \cos\varphi \sin\varphi \sin^3\theta \cos\theta\, d\rho\, d\varphi\, d\theta = \frac{a^2b^2c^2}{6}.$$

7. Use the fact that the figure is symmetrical; 1/16 of the volume lies
above the triangle with vertices $(0, 0)$, $(1, 0)$, $(1, 1)$ and below the surface
$x^2 + z^2 = 1$; 16/3.


. T (2r3 — 3r? h + h’). 

. 0. 

. 0. With the additional restriction z = 0; x/8. 

. 1/50,400. 

. Use cylindrical coordinates and integrate with respect to 0, r, and z 


in that order; r[2 — (3/2) log 3]. 


. Use spherical coordinates with origin at (0, 0, 4). With «= 


cos! [p —(3/4¢)] for ; <p < 3/2, 


3/2 pa pan 1/2 pr pon, 
fia J, J, + J, J, J, sin 6 dọ dO dọ 
= |2 +5 log 3l. 


14. Use polar coordinates: $4 \log (1 + \sqrt{2})$.

15. Let $(a, b)$ be any point of the domain and choose a $\delta$-neighborhood
$R_\delta$ of $(a, b)$ within $D$ so small that $|f(x, y) - f(a, b)| < \varepsilon$ in the
neighborhood. By the mean value theorem,
$$\iint_{R_\delta} f(x, y)\, dx\, dy = \mu \pi \delta^2,$$
where $|\mu - f(a, b)| < \varepsilon$. Since the integral vanishes, $\mu = 0$. Consequently,
$|f(a, b)| < \varepsilon$ for arbitrary positive $\varepsilon$, and hence, $f(a, b) = 0$.


16. Using $d(x, y)/d(u, v) = u/(1 + v^2)$, we obtain
$$\iint e^{-(x^2+y^2)}\, dx\, dy = \int_0^\infty \int_{-u/a}^{u/a} e^{-(u^2+a^2)} \frac{u}{1 + v^2}\, dv\, du = 2e^{-a^2} \int_0^\infty u e^{-u^2} \arctan \frac{u}{a}\, du.$$
Integration by parts yields the result.

17. Set $\rho^2 = \xi^2 + \eta^2$. From $\xi_x = \eta^2 - \xi^2$, $\xi_y = -2\xi\eta$, $\eta_x = -2\xi\eta$, $\eta_y =
\xi^2 - \eta^2$, it follows that $|d(x, y)/d(\xi, \eta)| = 1/\rho^4$ and also that $u_x^2 + u_y^2 =
\rho^4(u_\xi^2 + u_\eta^2)$.
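The partial derivatives quoted in this solution are consistent with the inversion $\xi = x/(x^2 + y^2)$, $\eta = y/(x^2 + y^2)$. That transformation is not written out in the solution itself, so it is an assumption here, and the function names below are hypothetical. Under that assumption, a finite-difference sketch confirms the four partial derivatives and the Jacobian.

```python
def inversion(x, y):
    """Assumed transformation: xi = x / (x^2 + y^2), eta = y / (x^2 + y^2)."""
    r2 = x * x + y * y
    return x / r2, y / r2


def numerical_partials(x, y, h=1e-6):
    """Central differences for (xi_x, xi_y, eta_x, eta_y)."""
    xi_px, eta_px = inversion(x + h, y)
    xi_mx, eta_mx = inversion(x - h, y)
    xi_py, eta_py = inversion(x, y + h)
    xi_my, eta_my = inversion(x, y - h)
    return (
        (xi_px - xi_mx) / (2 * h),
        (xi_py - xi_my) / (2 * h),
        (eta_px - eta_mx) / (2 * h),
        (eta_py - eta_my) / (2 * h),
    )


def claimed_partials(x, y):
    """Partials expressed through xi and eta, as quoted in the solution."""
    xi, eta = inversion(x, y)
    return (eta ** 2 - xi ** 2, -2 * xi * eta, -2 * xi * eta, xi ** 2 - eta ** 2)
```

The determinant $\xi_x \eta_y - \xi_y \eta_x$ then equals $-\rho^4$, so the absolute value of the inverse Jacobian is $1/\rho^4$.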

18. For new Cartesian coordinates to the same scale, the Jacobian of the
transformation is 1. With $r = (x^2 + y^2 + z^2)^{1/2}$, choose Cartesian
coordinates $u, v, w$ for which $u = (x\xi + y\eta + z\zeta)/r$. The integral
becomes
$$I = \iiint \cos ru\, du\, dv\, dw$$
over the sphere $u^2 + v^2 + w^2 \le 1$. In cylindrical coordinates $u$, $v =
\rho \cos\theta$, $w = \rho \sin\theta$, we find
$$I = \int_{-1}^{1} \int_0^{\sqrt{1-u^2}} \int_0^{2\pi} \rho \cos ru\, d\theta\, d\rho\, du = 4\pi \left(\frac{\sin r}{r^3} - \frac{\cos r}{r^2}\right).$$

19. — f° 4- yf SE de dy = 16 log 2 — 12. 


Exercises 4.7 (p. 416) 
1. (a) K = lim JP f° rlog r2 dr dé. 


(b) K= [jesse pot fc Nog (x2 + y2) dy dx. 


a cos B 
2. (a) T. (b) 1. 
3. Symmetry shows that reversal of the order of integration reverses the
sign. Since $I$ is not zero ($I = \frac12$), the result is established. Alternately,
for $0 < a, b < 1$, set
$$J = \int_b^1 \int_a^1 \frac{y - x}{(x + y)^3}\, dx\, dy = \frac{(1 - a)(1 - b)(b - a)}{2(1 + a)(1 + b)(a + b)}.$$
Integrating first with respect to $x$, then $y$, is equivalent to taking
$$I = \lim_{b\to 0} \lim_{a\to 0} J = \frac12;$$
integrating first with respect to $y$, then $x$, to taking
$$\lim_{a\to 0} \lim_{b\to 0} J = -\frac12.$$
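The dependence on the order of the two limits can be seen numerically. The sketch below assumes the integrand $(y - x)/(x + y)^3$ on $a \le x \le 1$, $b \le y \le 1$, which reproduces the closed form for $J$ quoted in this solution; the integrand is not legible in the source, so this choice and the function names are assumptions.

```python
def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3


def integrand(x, y):
    """Assumed integrand (y - x) / (x + y)^3."""
    return (y - x) / (x + y) ** 3


def j_numeric(a, b):
    """Double integral over a <= x <= 1, b <= y <= 1 by nested Simpson rules."""
    return simpson(lambda x: simpson(lambda y: integrand(x, y), b, 1.0), a, 1.0)


def j_closed(a, b):
    """Closed form quoted in the solution."""
    return (1 - a) * (1 - b) * (b - a) / (2 * (1 + a) * (1 + b) * (a + b))
```

Letting $a$ go to zero much faster than $b$ drives `j_closed` toward 1/2, while the opposite order drives it toward -1/2.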


Exercises 4.8 (p. 430) 


1. Apply Guldin’s rule; 272ab. 
2. 4nabh?. 
3. Set x= ač, y= bn, z= ch. With d = pi Val + b2m? + c2n2, the vol- 
ume is nabc(2 — 3d + d)/3. 
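4. (a) With $\theta$ and $\varphi$ as parameters for both surfaces, $\sqrt{EG - F^2} =
a^2 \sin\theta$.

(b) $\int_0^{2\pi} \int_0^{f(\varphi)} a^2 \sin\theta\, d\theta\, d\varphi = a^2 \int_0^{2\pi} \{1 - \cos f(\varphi)\}\, d\varphi.$

(c) Take $f(\varphi) = \pi/4$; $\pi a^2(2 - \sqrt{2})$.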

5. Let a, b, c be the lengths of the sides opposite A, B, C respectively, and 
p the altitude from C. Apply Guldin’s rule. 
(a) $\frac13 \pi c p^2$,
(b) $\pi p(a + b)$.

6. įr (n — m) (4n? + 4mn + 4m? — 6n — 6m + 3). 

7. Take polar coordinates in the x, y-plane as surface parameter for the 
cylinder x? + 22 = a?. Thus, x=r cos 9, y=r sin 9, z= Va? — r? 
and E = a?/(a? — r?), F = 0, G = r?. The surface area is then 


T/4 fp bsec® ar 
S=8f f, dr dé 


Ja -— r? 
nlá b sec 0 
= —8a f Va? — r? o de 
= 2a?r — 8al, 


where 
I= J i Va? — b2 sec20 dé. 
0 


Set 0 = arc tan (v(a? — 6?)/b? sin wœ) to obtain 


à (a? — b?) cos? w 
0o a? sin? w + b2 cos? w 


[= 


? 


where tan A = b/Va? — 262. The explicit integral is 
I = a arc tan (tan o) — bo K 
b 0 


Hence, 


ob 
Va? — 26°" 


S = 8a? 7 — arc tan a |- 8ab arc tan 


Va? — 2b? 
8. L= f f VEG — F? dr dô 
0 RAGS) OO 
=h 4 J, Vr? + f dr 
a = 02 1 12 
= + 0 
[V3 + log (1 + val fa 5f” de, 


(cf. Volume I, p. 215), which is [/2 + log (1 + /2)] times the area of the 
projection 


0, <9 <02,0<r < f’). 


Exercises 4.9 (p. 442) 


1. (a) Use cylindrical coordinates. On the axis of the cone, three-fourths 
of the way from the vertex to the base. 


(b) On the axis of the cone, two-thirds of the way from the vertex to the 
base. 


2. $x = 2x_0/3$, where $y = z = 0$.
3. Let $(\xi, \eta, \zeta)$ be the centroid:
$$\xi = \frac{1}{V} \int_0^a \int_0^{b(1 - x/a)} \int_0^{c(1 - x/a - y/b)} x\, dz\, dy\, dx,$$
where $V$, the volume of the tetrahedron, is obtained by replacing the
integrand $x$ by unity in the above triple integral. Integrate to
obtain $\xi = a^2bc/24V$, where $V = abc/6$. Hence, by algebraic symmetry,
$\xi = a/4$, $\eta = b/4$, $\zeta = c/4$.


4. (a) Use spherical coordinates, $z = 3(b^4 - a^4)/8(b^3 - a^3)$, $x = y = 0$.


(b) Factor b — a out of the numerator and denominator in the solution 
of part (a) and take the limit. 


. m (b? + c?)/3. 
. If u is the density, 


(a) nuh(R? — R”), 
all ; 1 
(b) 2ruh(R — R’) F (R+ R) + Aa 


. Use spherical coordinates. Mass, ina®[uUo + 341]. Moment of inertia, 


4r a® [uo + 5u1]/45. 


. Substitute x = a&, y = bn, z = c%; use the expressions for the moments 


of inertia given in the text and the properties of symmetry of the 
ellipsoid: 


4 24 pe 
(a) 15 tabe (a? + b?), 


(b) = rabe {(1 — «2)a2 + (1 — BDD? + (1 — y2)c?}. 


9. For example, with $A = \int_R (y^2 + z^2)\, dV$, $B = \int_R (z^2 + x^2)\, dV$, and
$C = \int_R (x^2 + y^2)\, dV$,
$$A + B = \int_R (x^2 + y^2 + 2z^2)\, dV = C + \int_R 2z^2\, dV > C.$$


10. Let $(\xi, \eta, \zeta)$ be the point on the ray at distance $1/\sqrt{I}$ from $O$. The
squared distance of a point $(x, y, z)$ from the line is
$$x^2 + y^2 + z^2 - \frac{(\xi x + \eta y + \zeta z)^2}{\xi^2 + \eta^2 + \zeta^2}.$$
Consequently,
$$I = \iiint \left[x^2 + y^2 + z^2 - \frac{(\xi x + \eta y + \zeta z)^2}{\xi^2 + \eta^2 + \zeta^2}\right] dx\, dy\, dz = \frac{1}{\xi^2 + \eta^2 + \zeta^2}.$$
Multiplying both sides of this equation by $\xi^2 + \eta^2 + \zeta^2$, we obtain a
positive definite quadratic expression in $\xi, \eta, \zeta$ set equal to unity; hence,
the equation is that of an ellipsoid.
a%(x — E)? + by — n)? + ee O? 

= fa? + bË + c? + 5E + n +) (e — E+ (y— n? e O. 
(3, 0, 0) 
_ 5a 2a? + b? + c? 
~ 16 a +b+ e’ 
14. $I = (I_1 + m_1 r_1^2) + (I_2 + m_2 r_2^2)$, where $r_1$ and $r_2$ are the distances of
the axes through the centers of mass of the respective parts from the
axis through the center of the system. Use $m_1 r_1 = m_2 r_2$ and $r_1 + r_2 = d$.
15. The distance of the point $(x, y, z)$ from the plane $ux + vy + wz = -1$ is
given by
$$\frac{ux + vy + wz + 1}{\sqrt{u^2 + v^2 + w^2}}.$$
The moment of inertia of the ellipsoid with respect to this plane is
therefore given by
$$\frac{Au^2 + Bv^2 + Cw^2 + V}{u^2 + v^2 + w^2},$$
where $A$, $B$, $C$ denote the moments of inertia with respect to the
coordinate planes and $V$ is the volume of the ellipsoid, that is, $A = 4\pi a^3bc/15$,
$B = 4\pi ab^3c/15$, $C = 4\pi abc^3/15$, and $V = 4\pi abc/3$. We have now to find the
envelope of the planes for which this expression is equal to $h$. The envelope is
given by the equations
$$(A - h)u = \lambda x, \qquad (B - h)v = \lambda y, \qquad (C - h)w = \lambda z,$$
where $\lambda$ denotes a common multiplier, which from the expression for
the moment of inertia and the equation of the plane is found to be $V$.
By squaring the three equations we obtain the equation of the envelope,
namely,
$$\frac{x^2}{h - A} + \frac{y^2}{h - B} + \frac{z^2}{h - C} = \frac{1}{V}.$$
2ra2bu —— 
Jez — g2 18 l o + vb? — q2), 


where u is the constant density. 
bo . 
2ru f ; Vz? + {f(z)}2 dz — mu |b? + a?|, where the lower or upper sign 


is to be taken according as the origin is inside the body or not. 

Let X be a variable point of the solid, O its center of mass and Y a 
variable point of the space where the potential is calculated. The 
potential at Y is 
19.


20. 
21. 


22. 


23. 


v= (Ie 


Let a be the maximum value of |X| in S, |X| < a, and suppose | Y|> a. 
Then, if M is the mass of the solid, 


uw- m= Mera ei 2” 
«hal 
< {wavs ” 


(since || Y|—|Y— x|| <|X| by the triangle inequality) 


= JI" ayia” 


(where we suppose | Y|= 2a) 


< 20M 
~ |Y]? 
5 3 11 15/2 
As A— BR 5° 4 g BR 5° we have A= 10, B R? ° The 


attraction at an internal point is equal to the attraction of the total 
mass of the points inside of the sphere of radius r concentrated at the 
center of the sphere. 

20. Use cylindrical or spherical coordinates.

21. By translation we can ensure that the triangle lies in the upper half-plane.
Then its moment of inertia is equal to
$$\varphi(x_1 y_1, x_2 y_2) + \varphi(x_2 y_2, x_3 y_3) + \varphi(x_3 y_3, x_1 y_1),$$
where $\varphi(x_1 y_1, x_2 y_2)$ denotes the moment of inertia of the quadrilateral
with vertices $(x_1, 0)$, $(x_1, y_1)$, $(x_2, y_2)$, $(x_2, 0)$ multiplied by the sign of $(x_1 - x_2)$.
Then show that
$$\varphi(x_1 y_1, x_2 y_2) = \tfrac{1}{12} (x_1 - x_2)(y_1^3 + y_1^2 y_2 + y_1 y_2^2 + y_2^3).$$
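The edge formula can be checked against a direct computation. The moment of inertia about the x-axis of the region under a segment is $\int y(x)^3/3\, dx$, and the three signed edge terms should add up to the moment of the triangle. The sketch below assumes the coefficient $\tfrac{1}{12}$, which is what the direct integration gives; the function names and the sample triangle are hypothetical.

```python
def phi_formula(x1, y1, x2, y2):
    """Signed edge term, with the coefficient 1/12 assumed."""
    return (x1 - x2) * (y1 ** 3 + y1 ** 2 * y2 + y1 * y2 ** 2 + y2 ** 3) / 12


def phi_numeric(x1, y1, x2, y2, n=1000):
    """Midpoint-rule value of the integral of y^3 / 3 along the segment,
    scaled by (x1 - x2) so that the sign convention matches phi_formula."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        y = y1 + t * (y2 - y1)
        total += y ** 3 / 3
    return total / n * (x1 - x2)


def triangle_moment(p1, p2, p3):
    """Moment of inertia about the x-axis as the sum of the three edge terms."""
    return (
        phi_formula(*p1, *p2)
        + phi_formula(*p2, *p3)
        + phi_formula(*p3, *p1)
    )
```

For the counterclockwise triangle with vertices (0, 1), (2, 1), (1, 3), whose area is 2, the standard formula $\frac{A}{6}(y_1^2 + y_2^2 + y_3^2 + y_1y_2 + y_2y_3 + y_3y_1)$ gives 6, and the sum of the edge terms agrees.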


2 Aly 
T=J, 0- ady f oug- = 12 — 16 log 2. 
23. Let $f(\rho)$ be the potential associated with a unit point charge. The
potential at a point $(0, 0, z)$ in the interior of a spherical lamina centered
at the origin and carrying unit charge density is
$$U(z) = \int_0^{2\pi} \int_0^{\pi} f(\rho)\, a^2 \sin\theta\, d\theta\, d\varphi,$$
where, in the integrand, if $a$ is the radius of the sphere, $\rho$ is given by
$$\rho = \sqrt{a^2 + z^2 - 2az\cos\theta}.$$
If $g$ is a function such that $g'(\rho) = \rho f(\rho)/z$, where $z$ is kept constant,
then
$$U(z) = 2\pi a\, g(\rho)\Big|_{\theta=0}^{\theta=\pi} = 2\pi a[g(a + z) - g(a - z)].$$
Since the force vanishes for $|z| < a$, we obtain
$$U'(z) = 2\pi a[g'(a + z) + g'(a - z)] = 0;$$
consequently,
$$(a + z) f(a + z) = (a - z) f(a - z).$$

This is a relation holding for all positive $a$ and all $z$ with $|z| < a$.
Introducing new independent variables $\xi$ and $\eta$ with $\xi = a + z$ and $\eta
= a - z$, we obtain
$$\xi f(\xi) = \eta f(\eta)$$
for all positive $\xi$ and $\eta$. Consequently, $\rho f(\rho) = c$, where $c$ is constant.
Thus, we conclude that
$$f(r) = \frac{c}{r} \qquad (c = \text{constant}),$$
which is the potential for the inverse square force law.


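Exercises 4.11 (p. 462)

1. Substitute $x_1 = a_1\xi_1, \ldots, x_n = a_n\xi_n$:
$$\frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}\, a_1 a_2 \cdots a_n.$$

2. $$I = \int \cdots \int \frac{f\left(\sqrt{1 - x_2^2 - \cdots - x_n^2}\right) + f\left(-\sqrt{1 - x_2^2 - \cdots - x_n^2}\right)}{\sqrt{1 - x_2^2 - \cdots - x_n^2}}\, dx_2 \cdots dx_n$$
taken throughout the interior of the $(n - 1)$-dimensional unit sphere in
$x_2 \cdots x_n$-space. Introducing polar coordinates, we obtain
$$I = \int_0^1 dr \int_{S(r)} \frac{f\left(\sqrt{1 - r^2}\right) + f\left(-\sqrt{1 - r^2}\right)}{\sqrt{1 - r^2}}\, dS,$$
where $S(r)$ denotes the sphere of radius $r$ and center $O$ in $x_2 \cdots x_n$-space.
As the integrand depends on $r$ only,
$$I = \omega_{n-1} \int_0^1 \frac{f\left(\sqrt{1 - r^2}\right) + f\left(-\sqrt{1 - r^2}\right)}{\sqrt{1 - r^2}}\, r^{n-2}\, dr.$$
Putting $y = \sqrt{1 - r^2}$, we have
$$I = \omega_{n-1} \int_{-1}^{+1} f(y)(1 - y^2)^{(n-3)/2}\, dy.$$

3. $a_1 a_2 \cdots a_n/n!$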
Exercises 4.12 (p. 474) 


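1. Put $I_n(a) = \int_0^\infty x^n e^{-ax^2}\, dx$; then $I_n(a) = -I'_{n-2}(a)$, where primes denote
differentiation with respect to $a$. Alternatively, integrate by parts.
$$I_n(a) = \frac{1 \cdot 3 \cdots (n - 1)}{2^{n/2+1} a^{n/2}} \sqrt{\frac{\pi}{a}} \ \text{when $n$ is even}, \qquad I_n(a) = \frac{\left(\frac{n-1}{2}\right)!}{2a^{(n+1)/2}} \ \text{when $n$ is odd}.$$

2. Integrate by parts. Diverges for $y < 0$; for $y > 0$, $F(y) = 0$.
3. Use the relation
$$\frac{1}{r}(f_x \cos\varphi + f_y \sin\varphi) = f_{xx} \sin^2\varphi - 2f_{xy} \sin\varphi \cos\varphi + f_{yy} \cos^2\varphi + \frac{1}{r} \frac{d}{d\varphi}(f_x \sin\varphi - f_y \cos\varphi).$$

4. Integrate $u_{xx}$ by parts twice (special precautions necessary in the case
where $p < 5/2$).

5. Substitute $\xi = \alpha x + \beta y$, $\eta = \gamma x + \delta y$, where $\alpha, \beta, \gamma, \delta$ are chosen so
that
$$\xi^2 + \eta^2 = ax^2 + 2bxy + cy^2.$$
Then $(\alpha\delta - \beta\gamma)^2 = ac - b^2$, and the integral is transformed into
$$\frac{1}{\sqrt{ac - b^2}} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(\xi^2 + \eta^2)}\, d\xi\, d\eta.$$
$ac - b^2 = \pi^2$, $a > 0$.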
6. Make the same substitution as in Exercise 5 and evaluate the resulting 
integrals, (a) using the result of Exercise 1, (b) introducing polar co- 
ordinates. 


(a) m(aC + cA + 2bB) 
(ac — b2)3/2 


2r 
(b) (ac — b2)1/2° 


7. Differentiate with respect to x and integrate by parts to obtain 


x fi _ 
= —- f V1 — £ cos xt dt. 
m Jı 


Differentiate the first of these expressions with respect to x to obtain 


1 Í 1 t2 
” = —= cos xt dt. 
Jo mJ-1 V1 — t? 
Now combine the integral representations with the cosine factor in the
integrand. 

8. Compare the answer to Exercise 7. 

9. (a) Forming K’(a), where the dash denotes differentiation with respect 
to a, and integrating by parts twice (taking xe-1z? as one factor), 
we have K’(a) = —K (a) /2a + K(a) /4a? ; that is, 

K(a) = Ca-}2 e-1/4a, 
where C is given by C = lim Va K(a) = lim Í e cos + dt = L Ja. 
a-o g—o J 0 va 2 
1 /x 
K(a) = 9 JZ e-1/4a, 


(b) Integrate the formula ¢/(1 + t3 = Í etz cos x dx with respect to t 


from a to b. 


1 l +a? 
2108 IF o 
(c) Substituting x = 1/t in the expression for I’(a), prove that I’ = —21, 
that is, 
I = Ces, 


where C = lim / = o e-2 dx. 
a—0 


; VT e72a, 
(d) Substitute the integral expression for Jo and change the order of 


integration. Use the formula 2 sin ax cos bxt = sin (a + bt)x + 
° sin xy 


sin(a— bt) x; cf. the expression for | dy on pp. 463. 
0 


7/2 when a > b; arc sin a/b when a < b. 


10. Set sin? ax = (1 — cos 2ax)/2. Compare Volume I, Section 3.15, p. 322; 
Exercise 8 and 9b. 

11. There exists an $\varepsilon > 0$ such that for every $A$ there is an $A' > A$ such
that
$$\left|\int_{A'}^{\infty} f(x, y)\, dy\right| \ge \varepsilon$$
for some value of $x$.
Exercises 4.13 (p. 497) 


1. (a) ic (ea —1)/,/9x T. 
(b) 1//2x (a + it). 
(c) From 4.12, Exercise 8, $J_n(x)/x^n$ is the Fourier transform of the
function 


n! 27 
f(x) = [yas (2n!) 
0, |x|>1. 


Consequently, by Fourier’s integral theorem f(—t)= f(t) is the 
Fourier transform of Jn(x). 


(1 — ¢2)"-1, |x|< 1 


Exercises 4.14 (p. 513) 


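1. From (97b),
$$\Gamma\left(n + \tfrac12\right) = \frac{(2n - 1)(2n - 3) \cdots 3 \cdot 1}{2^n} \sqrt{\pi} = \frac{(2n)!\, \sqrt{\pi}}{2^n (2n)(2n - 2) \cdots 2},$$
which immediately yields the desired result.
2. From (97a),
$$\Gamma\left(n + \tfrac12\right) \Gamma\left(\tfrac12 - n\right) = \frac{\pi}{\sin \pi\left(n + \tfrac12\right)} = (-1)^n \pi.$$
Insert the result of (97b) to obtain
$$\Gamma\left(\tfrac12 - n\right) = \frac{(-1)^n 2^n \sqrt{\pi}}{1 \cdot 3 \cdot 5 \cdots (2n - 1)}.$$
3. From (98d),
$$B(x, x) = 2 \int_0^{\pi/2} (\sin t \cos t)^{2x-1}\, dt = \int_0^{\pi} \frac{(\sin s)^{2x-1}}{2^{2x-1}}\, ds \qquad (s = 2t)$$
$$= 2 \int_0^{\pi/2} \frac{(\sin s)^{2x-1}}{2^{2x-1}}\, ds = 2^{1-2x} B\left(x, \tfrac12\right).$$

4. Set $s = t^x$ in the integral to obtain
$$I = \frac{1}{x} \int_0^1 s^{1/x - 1} (1 - s)^{-1/2}\, ds = \frac{1}{x} B\left(\frac{1}{x}, \frac12\right) = \frac{1}{x} \frac{\Gamma(1/x)\, \Gamma(1/2)}{\Gamma(1/x + 1/2)}.$$
5. Set $t = x^2$ in the integral
$$I = \int_0^1 \frac{x^\alpha}{\sqrt{1 - x^2}}\, dx$$
to obtain
$$I = \frac12 \int_0^1 t^{(\alpha-1)/2} (1 - t)^{-1/2}\, dt = \frac12 B\left(\frac{\alpha + 1}{2}, \frac12\right) = 2^{\alpha-1} B\left(\frac{\alpha + 1}{2}, \frac{\alpha + 1}{2}\right),$$
where the result of Exercise 3 is employed at the end.
(a) For $\alpha = 2n + 1$, this yields
$$I = 2^{2n} \frac{\Gamma(n + 1)\, \Gamma(n + 1)}{\Gamma(2n + 2)} = \frac{2^{2n} (n!)^2}{(2n + 1)!}.$$

(b) For $\alpha = 2n$, with the result of Exercise 1, we obtain
$$I = 2^{2n-1} \frac{\Gamma(n + 1/2)\, \Gamma(n + 1/2)}{\Gamma(2n + 1)} = 2^{2n-1} \left[\frac{(2n)!\, \sqrt{\pi}}{2^{2n}\, n!}\right]^2 \Big/ (2n)!,$$
which immediately yields the desired result.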
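The Gamma and Beta function identities used in Exercises 1 to 3 can be spot-checked with the standard library. This is a small sketch, not the book's derivation; `gamma_half` and `beta` are hypothetical helper names, and `gamma_half` uses the standard closed form $\Gamma(n + \tfrac12) = (2n)!\sqrt{\pi}/(4^n n!)$.

```python
import math


def gamma_half(n):
    """Closed form for Gamma(n + 1/2), n a nonnegative integer."""
    return math.factorial(2 * n) * math.sqrt(math.pi) / (4 ** n * math.factorial(n))


def beta(x, y):
    """Beta function expressed through Gamma functions."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


def reflection_product(n):
    """Gamma(n + 1/2) * Gamma(1/2 - n), which should equal (-1)^n * pi."""
    return math.gamma(n + 0.5) * math.gamma(0.5 - n)


def duplication_gap(x):
    """Difference B(x, x) - 2^(1 - 2x) * B(x, 1/2), which should vanish."""
    return beta(x, x) - 2 ** (1 - 2 * x) * beta(x, 0.5)
```

For instance, `gamma_half(2)` returns $3\sqrt{\pi}/4$, in agreement with `math.gamma(2.5)`.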
. Set x™ = a™héE/c, y” = b™hy/c, and z = hf to obtain the volume inte- 


gral 
_ abh alm mE (1/m)—-1 »(1/m)-1 
5 Tf fE nitim-1 dY dn dé. 


T mè? 


Then, on integrating with respect to ¢ and ^, 
y = ooh Hle ty 1) — B| + 1, +4 1) 
m \c m 
— mai Blam t 2) | 
— abh "BG 41,24 1) 
c m m 
. Set x? = a2, y2? = b?n, z? = c2t to reduce the integral to 


I= ame ffi fE + n + Y) Elpl2)-1 y(a/2)-1 Yir- dE dn dt 


over the tetrahedron bounded by the coordinate planes and the plane 
E +n +%= 1. Now replace ¢ by the new variable t witht = t— £ — n 
to obtain 


r= PE EL few goma qam- q — g — new dE da dt 


= Gene f f (D-1 (t — 4) (P/2)+07/2)-1 f t play (1 — u)(r2)-1 
0 


du dn dt 
where we have put & = (t — n)u. Thus, 




10. 


7 
_ arbi" pfp r\ (it (q/2)—1 (+ — y\(p/2)+(r/2)—1 
r=% BESS FEAD- (t — ny dn dt. 


Now, setting y = tv in this, we obtain 


_ arbi" p/p r aptr (pt+qtr)/2-1 
r="? Ba) Bees 1) f f (ft wra dt, 


which immediately gives the desired result. Note the general result 
implied by the foregoing: 


J= Í i if FE + +0) E1 nB- CY-1 dé dy dt 


where the triple integral is taken in the positive octant bounded by the 
plane +n +% = 1. Many integrals can be reduced to this form, as 
seen in the following exercises. 


. Set x = af", y = bn”, z = cl” to obtain 


a ffen- n”-1 cr-i dé dn dt 
Sf fee nn) gn- dé dy dt 


where the integrals are taken over the positive octant bounded by the 
plane +7 +% <1 and have the form of the integral J in the so- 
lution of Exercise 7. Consequently, 
z= 3a I\(2n) T(8n) 
4 T(n)T(4n) ` 


x= 


. Set x = RE2/3, y = Rn?” to obtain 


T= 4 [ff x2 dx dy =9R? |f £72 1? dé dn, 


where the latter double integral is taken over the positive quadrant in 
the &, y-plane bounded by the line €+7= 1. As in Exercise 8, this 
yields 


ll 3 


4 
I= 2R*B (7, ; 


)=5 — ERA, 
As in Exercise 7, replace xo through xo = t — xı — + - + —Xn. Then, 


T a oo TE, for r Faca fwo + H wn) 


XyW-1- - + nanl dxn- » + dxr» + - dx1 dxo 


t— e e e Th t-i ee Z _ — 
=f |, É 7h me xg gmt F(t) 


t-r e o o -T 
Í 1 nl ynan (t — xı- + + —Xn)®0-1 dxn Axn-1+++ Axx 


- dxidt. 
In the integral with respect to $x_n$, set $x_n = (t - x_1 - \cdots - x_{n-1})u_n$, which
yields


a e e e Tp_ 
{, 1 n xen ?n-Ut —X1°° + —Xn)%0! dxn 


1 
= (t — X1° + + —Xn-1)%0t4n t} f Un2n-! (1 — Un)%01 dun 


= (t — xı °° + —Xn-1) %0+%n-! Blan, ao). 


Iterating this procedure with xx = (t — x1- +--+ — xx-1)uzxfork=2,... 
n and x1 = tu,, we finally obtain 


I= Bian, ao) B(Qn-1, an + ao) - + - Blai, a2 +++ + + an + ao) 
f t F(Dtootat - - . an- dt, 


which immediately yields the desired result. 
11. Show that for Gn(x) defined by the expression following the limit sign in 
the right hand side of formula (86e), p. 506, 


Gan(Bx) = 5 2°Ca(a)Gu (x + 3) GOAN 


then let $n \to \infty$ and apply Wallis’s formula (Volume I, p. 282).


12. (a) Set u=a—p, v=ß— q. Integrating D- f(x) repeatedly by 
parts, we obtain 


D De f(xy = Ox y... LP OOxtte™ 
D DYTO= Fae Dt t Tap) 


* Tw rw + Do e — DHP APA) dt 


Noting that the derivatives at 0 vanish and differentiating p 
times with respect to x, we then find 


GD g = DFA == D fo = D” fa). 


=i ceo (x — 1 FO) dt. 


Further integrations by parts yield 


f® (0)x” (0)x¥ o. f+- O)xuta-l 
EO = u+ Tu + g) 
+ rae aly © OAO at 


Since the derivatives of f at the origin vanish, we then find 
D™ D*f (x) = D” g(x) 


x (x — t)o-1 t (t — g)uta- -1 foto (s) 
=|, Pv) J, Tu + q) ds di 
o p+0) (g f x — t)?-1(¢ — s)“+ta-1 dt ds. 
=ru rg fo? Of e- e-s 
We evaluate the inner integral by introducing a new variable of 
integration, z = (t — s) / (x — s) to obtain 


vra = _B(u +q, v) +q, v) 7 — gju+tvtg- + 
DD) = 0 Nia rP Í (x — s)utora-l fipta(s) ds 


= Fwd ore Jy © Irs ferro) ds. 


Now differentiating q times, we find 
(iii) D8 D*f(x) = D?D™ g(x) 
— 1 ix — +v-1 f(p+g) 


The final result is symmetric in u and v and, hence, independent 
of the order in which the operators D* and DP are applied; hence, 
D*DB8 f(x) = DBD" f(x). 

(b) Let $r$ be the smallest integer greater than $\alpha + \beta$, $w = r - \alpha - \beta$.
Then (ii) yields 


D**8 f(x) = Fw) f æ 9" (© dt. 


If u +v <1, then r=p +q, w= u +v, and this integral is the 
same as that for D? D” f(x) obtained in (iii). However, if1 < u +v <S 
2, then w = u + v — l and r = p + q + 1. Now we only carry the 
expansion (i) out to the (r — 1)-th derivative, namely, 


D~ f(x) = “(æ — twtr-2 fr-D(t) dt 


so J 
(ww +r—1) Jo 
and differentiate r — 2 times with respect to x to obtain 


DD f (x) = D™* 8? f (x) 


“Fe T eroh (x — t)’ f o-DCt) dt 


=r TED Sap — ON f eo) dt. 


Thus, in this case, $D^\alpha D^\beta f(x) \neq D^{\alpha+\beta} f(x)$.


Exercises 5.2 (p. 555) 


1. (a) —b/2a282. 
(b) 0. 
(c) 0. 
4. Write $d(u, v)/d(x, y) = (uv_y)_x - (uv_x)_y = \operatorname{curl}(u \operatorname{grad} v)$.


Exercises 5.7 (p. 588) 


1. Observe that $\xi = X_u + X_v$, $\eta = X_u - X_v$.
2. Compare the direction X, of the exterior normal with the normal di- 
rection represented by Xe x X4. 


3. (a) The line $v = a/2$ divides $S$ into a portion $S'$ given by $a/2 < v < a$
(or, equivalently, by $-a < v < -a/2$) and oriented by $\xi = X_u$, $\eta =
X_v$, and a portion $S''$ given by $-a/2 < v < a/2$, which is just another
Möbius band.


(b) $S_1$ is representable in the form (40a) with $v$ restricted to the interval
$0 < v < a$. Obviously, any two points on $S_1$ can be joined by the
curve on $S_1$ that is the image of the line segment joining the
corresponding points $(u, v)$ in the parameter plane.


(c) $S_1$ is oriented by $\xi = X_u$, $\eta = X_v$.


4. One easily verifies that $R(t)$ has length $|\xi|$ and is linearly dependent on
$\xi$, $\eta$ and, hence, lies in $\pi$. Moreover, $R(t) \cdot \xi/|\xi|^2 = \cos t$. The vector
$R(t)$ coincides with $\xi$ for $t = 0$ and has the direction of $\eta$ for a certain
$t$ between 0 and 180°, namely, for that $t$ determined by the relations
$$\cos t = b/\sqrt{ac}, \qquad \sin t = \sqrt{1 - b^2/ac}.$$


Exercises 5.9a (p. 602) 


1. ffas = (cat get a) fife d dy dz, 


where the volume integral is to be extended throughout the upper half of 
the ellipsoid. (The base of this half-ellipsoid contributes nothing to the 


surface integral): ia + a + <a} abe’. 


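2. Since $H$ is a homogeneous function of the fourth degree, we have
$$4 \iint H\, dS = \iint (xH_x + yH_y + zH_z)\, dS = \iint \frac{\partial H}{\partial n}\, dS = \iiint \Delta H\, dx\, dy\, dz$$
$$= 6 \iiint [x^2(2a_1 + a_4 + a_6) + y^2(2a_2 + a_4 + a_5) + z^2(2a_3 + a_5 + a_6)]\, dx\, dy\, dz.$$
$\frac{4\pi}{5}(a_1 + a_2 + a_3 + a_4 + a_5 + a_6)$.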


Exercises 5.9e (p. 610) 


1. (a) Compare Exercise 8, Section 2.4, p. 203. 


(c) Let R be an arbitrary region and v an arbitrary function vanishing 
on the boundary of R. Then, by Green’s first formula, 
JJ J Wena + UzUz + UzgUz3) Ax1 dx2 dx3 
=— Í WS v Au dx dx dxs 


= — S v Au Vejere3 dpi dpz dps. 


Now 
Uri = up, PA ae 1+ u Up 5 spa +u Ups ps 
= upre, T Uee T Mog 
and 
Uri = = Up +v p o. + Vps o- 
Hence, 


IKAS + UztUrə + UzUz3) dx1 dxe2 dx3 
=fff Fa Up Up, + eo Up2Upe + es Ups0Ps dx1 dxe2 dx3 


={{f REZ UpUp; + Jez ~ UpUp: + (ae =? ups¥eq)| dP dpz dps 


= fff Uwr, + Uzvp, + Usvp,) dpi dpz dps, 


Ve1e2e3 
———— Upi 
ei 


where we write Ui = 


Applying Gauss’s theorem to the vector (U1v, U2v, U3v), we obtain 


wy api + ape + z v dpi dp2 dps. 


Thus, for an arbitrary v vanishing on the boundary of R we have 


ffe Au Veiezez Api dpz dps 


~ Ie ta + oo + 5p.) oP dp2 dps 


and, hence (cf. Lemma I, p. 744), 


au = (208 4 902 5 Us) 1 
Opi Ope Ops} Veie2e3 


=- f2 (J) + 2 ( mae) 4% ( fee) 
Ve1e2e3 LOP1 eı 9pi/ Ope ez Ope} ops e3 Ops 


(d) Use Exercise 9c, Section 3. 3d, p. 257: 




i (te — ti) (ts — tı) (ts — ta) Au = (ts — COFA (VEE) A 


ð ——— 0 
+ (ts — tı) — (tə) Itz (v — $(t2) a] 


+ (te — ti)V (ts) im ( v7 (ts) A , 
where ¢(x) = (a — x) (b — x) (c — x). 


Exercises 5.10a (p. 615) 


1. (a) I= — ffa, + x) dy dz, where x = V] — y? — zè. 


-f L= __13 oda — 3 
(b) I= f L= x | «y dz= 2 Jo q C08% dé = 37 


Exercises 5.10b (p. 617) 


2. If (ξ, η) and (x, y) are rectangular coordinates in Π and P, respectively,
then the motion of the point M(x, y) can be described by the equations

\[ \xi = x\cos\varphi - y\sin\varphi + a, \qquad \eta = x\sin\varphi + y\cos\varphi + b \]

(i.e., by a rotation and a translation). Then

\[ S(M) = A(x^2 + y^2) + Bx + Cy + D. \]

(α) If A = nπ ≠ 0, we have S(M) = nπ[(x − x₀)² + (y − y₀)²] + S(C),
where C is the point x = x₀ = −B/2nπ, y = y₀ = −C/2nπ; hence
A, B, C, D have the values in Exercise 1.

(β₁) If A = nπ = 0 but B² + C² > 0, then

\[ S(M) = \sqrt{B^2 + C^2}\;\frac{Bx + Cy + D}{\sqrt{B^2 + C^2}} = \lambda\,d(M), \]

where λ = √(B² + C²) and Δ is the line Bx + Cy + D = 0.

(β₂) If A = B = C = 0, we have S(M) = D = constant.

3. For the motion of the plane P rigidly attached to the connecting-rod
AB, we have n = 0, S(A) = 0, S(B) = π·CB² = πr². Hence, Δ passes
through A, and by symmetry, Δ is perpendicular to AB at A. Hence,
S(M) = πr² l⁻¹ d(M), where l = AB.

4. For the motion of the plane P rigidly attached to the chord AB, we
have n = 1, S(A) = S(B) = S = area of Γ. The point C of Steiner's
theorem is therefore equidistant from A and B and S(A) = π·CA² + S(C),
S(M) = π·CM² + S(C); hence, S(A) − S(M) = area of Γ − area of Γ′
= π(CA² − CM²) = πab.


5. If l is the length of T, the Frenet formulae (Exercise 16, Section 2.5, 


p. 216) give 
n, ff, _ fr: _ (X; a.
fZ ds= |> ds = fi ds = qaz 28S = 0; 


J= ds = fx x tıds=x x bs 


' — [xx ds 


= — fë: x %ı ds=0 
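6. Let n′ = (α, β, γ), x = (x, y, z). If in Gauss's formula

\[ \iint (a\alpha + b\beta + c\gamma)\,d\sigma = \iiint \Bigl(\frac{\partial a}{\partial x} + \frac{\partial b}{\partial y} + \frac{\partial c}{\partial z}\Bigr)dx\,dy\,dz, \]

we substitute a = 1, b = c = 0, and a = 0, b = −z, c = y, we get

\[ \iint \alpha\,d\sigma = 0 \quad\text{and}\quad \iint (y\gamma - z\beta)\,d\sigma = 0, \]

respectively.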


7. Take rectangular coordinates (x, y, z) such that z = 0 is the free
horizontal surface of the fluid and Oz points downward. The pressure
on dσ is nz dσ, where z is the depth of dσ. By repeated applications
of Gauss's formula in three dimensions, with obvious choices of the
functions a, b, c, we find for the components of the resultant of the fluid
pressure

\[ \iint \alpha z\,d\sigma = 0, \qquad \iint \beta z\,d\sigma = 0, \qquad \iint \gamma z\,d\sigma = -\iiint dx\,dy\,dz = -V. \]

For the components of the resultant moment with respect to the origin
O we find, again by Gauss's formula,

\[
\begin{aligned}
\iint (zy\gamma - z^2\beta)\,d\sigma &= \iiint y\,dx\,dy\,dz = Vy_0, \\
\iint (z^2\alpha - xz\gamma)\,d\sigma &= -\iiint x\,dx\,dy\,dz = -Vx_0, \\
\iint (xz\beta - yz\alpha)\,d\sigma &= 0
\end{aligned}
\]

(x₀, y₀, z₀ are the coordinates of the centroid C). Now we note that the
components of the force f are 0, 0, −V, and the components of its
moment with respect to O are Vy₀, −Vx₀, 0.


8. From the parametric equations

\[ x = a\cos u\cos v, \quad y = b\sin u\cos v, \quad z = c\sin v \qquad \Bigl(0 \le u < 2\pi,\ -\frac{\pi}{2} \le v \le \frac{\pi}{2}\Bigr) \]

of the ellipsoid we readily obtain the formulae

\[ p\,dS = abc\cos v\,du\,dv, \qquad \frac{dS}{p} = \frac{D^2\,du\,dv}{abc\cos v}, \]

where

\[ D^2 = b^2c^2\cos^2u\cos^4v + a^2c^2\sin^2u\cos^4v + a^2b^2\sin^2v\cos^2v. \]


10. The integral represents the flat solid angle which the plane z = 0
subtends at the point M = (0, 0, 1). For a direct analytical proof, use
plane polar coordinates.

12. Verify the identity

\[ \frac{\partial}{\partial x}\Bigl(\frac{a - x}{\gamma^3}\Bigr) + \frac{\partial}{\partial y}\Bigl(\frac{b - y}{\gamma^3}\Bigr) + \frac{\partial}{\partial z}\Bigl(\frac{c - z}{\gamma^3}\Bigr) = 0, \qquad \gamma^2 = (x - a)^2 + (y - b)^2 + (z - c)^2, \]

for all points (x, y, z) different from (a, b, c). From Gauss's formula in
three dimensions we conclude (i) that Ω = 0 if Σ is a closed surface
such that A = (a, b, c) is outside the volume bounded by Σ; (ii) that if
A is within Σ, the value of the integral is independent of the shape of
Σ. Taking for Σ a sphere with center A, we easily see that Ω = 4π.

13. The integral, writing γ for r,

\[ \frac{\partial\Omega}{\partial a} = \iint \Bigl[\frac{\partial}{\partial a}\Bigl(\frac{a - x}{\gamma^3}\Bigr)dy\,dz + \frac{\partial}{\partial a}\Bigl(\frac{b - y}{\gamma^3}\Bigr)dz\,dx + \frac{\partial}{\partial a}\Bigl(\frac{c - z}{\gamma^3}\Bigr)dx\,dy\Bigr] \]

is independent of Σ and depends only on the boundary Γ of Σ, for the
identity given in the answer to Exercise 12 implies that

\[ \frac{\partial}{\partial x}\Bigl[\frac{\partial}{\partial a}\Bigl(\frac{a - x}{\gamma^3}\Bigr)\Bigr] + \frac{\partial}{\partial y}\Bigl[\frac{\partial}{\partial a}\Bigl(\frac{b - y}{\gamma^3}\Bigr)\Bigr] + \frac{\partial}{\partial z}\Bigl[\frac{\partial}{\partial a}\Bigl(\frac{c - z}{\gamma^3}\Bigr)\Bigr] = 0. \]

By Stokes's theorem (p. 611) and the discussion of Chapter 5, pp. 613-614,
the surface integral expression for ∂Ω/∂a may be expressed as a
line integral ∫ u dx + v dy + w dz along Γ. Verify that the functions

\[ u = 0, \qquad v = \frac{z - c}{\gamma^3}, \qquad w = -\frac{y - b}{\gamma^3} \]

satisfy the identities

\[ \frac{\partial w}{\partial y} - \frac{\partial v}{\partial z} = \frac{\partial}{\partial a}\Bigl(\frac{a - x}{\gamma^3}\Bigr), \qquad \frac{\partial u}{\partial z} - \frac{\partial w}{\partial x} = \frac{\partial}{\partial a}\Bigl(\frac{b - y}{\gamma^3}\Bigr), \qquad \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = \frac{\partial}{\partial a}\Bigl(\frac{c - z}{\gamma^3}\Bigr). \]

14. Note the following facts: (1) the value of the line integral θ remains
unchanged if Γ is deformed in such a way that Γ never sweeps over
any of the points (−1, 0) or (1, 0) during its deformation; (2) θ = 2π if
Γ is a small circle around (1, 0) oriented counterclockwise; (3) θ = 2π
if Γ is a small circle around (−1, 0) oriented clockwise.

15. Think of C as being a rigid circle made of wire and of Γ as being a
string. Now deform the string Γ to a new position Γ′ lying entirely
within the plane y = 0. The numbers p and n are not changed during
this deformation, and the first formula now follows directly if Exercise
14 is applied to the curve Γ′ within the plane y = 0 and the line segment
−1 < x < 1, y = 0, z = 0 of this plane. The factor 4π (instead of
2π, as in the previous example) results from the solid angle Ω increasing
by 4π along a closed path for which p = 1, n = 0. One way of carrying
out the above deformation of Γ into Γ′ analytically is as follows.
Assume that Γ does not meet the z-axis and let

\[ x = \gamma(t)\cos\varphi(t), \quad y = \gamma(t)\sin\varphi(t), \quad z = z(t) \qquad (0 \le t \le 2\pi) \]

be the parametric equations of Γ. Consider now the family of curves

\[ \Gamma(\tau):\quad x = \gamma(t)\cos[\tau\varphi(t)], \quad y = \gamma(t)\sin[\tau\varphi(t)], \quad z = z(t), \]

depending on the parameter τ, which decreases from τ = 1 to τ = 0.
Note that Γ(1) = Γ and that Γ′ = Γ(0) is a closed curve that lies in
the plane y = 0. Note also that (for a fixed value of t) each point P
of Γ(τ) rotates about the z-axis as τ varies; hence, the solid angle
Ω that C subtends at P does not vary with τ. This implies that Ω₁ −
Ω₀ will have the same value for Γ(0) as for Γ(1) = Γ. To prove the second
formula, note that


. (PP x dP’ - (dP x dP’) 
=- fL e PP = _ [ [P sara TPT 


16. Take a coordinate system Ox₁, Ox₂, Ox₃, and denote the position vector
of a variable point on Γ by x. Then

\[ \mathbf{a} = \frac{1}{2}\int_\Gamma \mathbf{x} \times d\mathbf{x} \]

has the required properties, for

\[ a_3 = \frac{1}{2}\int_\Gamma (x_1\,dx_2 - x_2\,dx_1) \]

is the area of the projection of Γ on the plane Ox₁x₂.


17. The two equations u = f_x, v = f_y can be solved for x and y, since
∂(u, v)/∂(x, y) ≠ 0. Let x = σ(u, v), y = τ(u, v); since u_y = v_x, we have
(cf. p. 261) x_v = y_u, that is, σ_v = τ_u. Hence, a function g exists such that
x = g_u(u, v), y = g_v(u, v).


18. = YE 
“G2 y?) Vx F y? F’ 


— —XZ 0 
(x? + y?) Vx + y + 22?’ 


Exercises 6.1e (p. 671)
1. With θ̇ = 0, equation (17c) takes the form

\[ \text{(i)}\qquad \dot r^2 = c + \frac{b}{r}, \]

where c = 2C/m and b = 2γμ. Writing this in the form

\[ \sqrt{\frac{r}{cr + b}}\;\frac{dr}{dt} = \pm 1 \]

and integrating, we obtain if c ≠ 0,
(iia) t= k+ ver? + br_ z f(r),
C C 
where 
7 ar sinh (1 + 2cr/b) for c>0 
(iib) f=) 4 
Ja are sin (1—2cr/b) for c<0Q, 
and if c = 0, 
7 2/3 
(iic) r= (“SP + 2 . 


Returning to the differential equation (i), we determine the integration
constant c by

\[ c = \dot r_0^2 - \frac{b}{r_0}. \]

If c < 0, we see that r is bounded, r ≤ −b/c. If ṙ₀ > 0, r increases to
this value and then decreases as the orbiting body falls toward the sun.
If ṙ₀ < 0, the body moves directly toward the sun until collision.

If c = 0, we observe that the constant of integration k in (iic) is k =
±r₀^{3/2} = b^{3/2}/ṙ₀³, where the plus or minus sign is taken according to
whether ṙ₀ is positive or negative. If ṙ₀ is negative, we again get a solution
in which the body accelerates into the sun. If ṙ₀ is positive, the body
escapes to infinity but with limiting velocity zero.

If c > 0 and ṙ₀ < 0, the body accelerates into collision with the sun
as before. But if ṙ₀ > 0, the body escapes and it can be seen from (i) and
(iia) that it has a positive limiting velocity, namely,

\[ \sqrt{c} = \sqrt{\dot r_0^2 - \frac{b}{r_0}}. \]
2. For both the parabola and the hyperbola, the orbit is nonperiodic and θ
is bounded. Consequently, from

\[ \int_{\theta_0}^{\theta} r^2\,d\theta = h(t - t_0), \]

for t to approach ∞, r also must approach ∞. From (17d) we conclude that
θ̇ → 0 as t → ∞; hence in (17c), from

\[ \lim_{t\to\infty} r^2\dot\theta^2 = (\lim r^2\dot\theta)(\lim \dot\theta) = h\,\lim\dot\theta = 0, \]

we conclude that lim ṙ² = 2C/m. However, from the definition of e, for
the parabola (e = 1) C has the value 0 and for the hyperbola (e > 1), a
positive value.


3. The force is −(m/2) grad r². Hence, by conservation of energy,

\[ \frac{1}{2}m(\dot r^2 + r^2\dot\theta^2) + \frac{1}{2}mr^2 = C \]

and the moment equations, as for any centrally directed force, yield

\[ r^2\dot\theta = h. \]

We eliminate t from these equations, as we did from the equations (17c)
and (17d) for planetary motion, to obtain

\[ \frac{dr}{d\theta} = \pm\frac{r}{h}\sqrt{\frac{2Cr^2}{m} - h^2 - r^4}. \]

This is easily integrated to give

\[ r^2 = \frac{a}{b + \sin 2\theta}, \]

where a = 2h² and b = √(1 − h²m²/C²). In Cartesian coordinates this
becomes

\[ b(x^2 + y^2) + 2xy = a, \]

which is the equation of a conic section.


4. The force is −grad U, where U = −∫ f(r) dr. As for planetary motion
we may apply conservation of energy and the moment equation (17d),
namely,

\[ \frac{1}{2}m(\dot r^2 + r^2\dot\theta^2) - \int f(r)\,dr = C, \qquad r^2\dot\theta = h. \]


We may now proceed in the same way to the desired result. 
5. Apply the result of Exercise 4. 
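6. If (ξ, η) are the coordinates with respect to the axes of the ellipse, then

\[ \xi = a\cos\omega = x + ea, \qquad \eta = b\sin\omega = y \]

give the equation of the ellipse and by the law of areas

\[ h(t - t_0) = \int_0^{\omega}\Bigl(x\frac{\partial y}{\partial\omega} - y\frac{\partial x}{\partial\omega}\Bigr)d\omega = ab\int_0^{\omega}(1 - e\cos\omega)\,d\omega. \]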


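7. The motion takes place in a plane, since p is a central force (proved for
the case p = 1/r² on p. 666). Hence,

\[ \ddot x = -\frac{x}{r}\,p, \qquad \ddot y = -\frac{y}{r}\,p. \]

It follows that

\[ x\dot y - y\dot x = \text{constant} = h, \qquad \dot x\ddot x + \dot y\ddot y = \frac{-x\dot x - y\dot y}{r}\,p = -\dot r\,p. \]

Hence,

\[ \frac{1}{2}\frac{d}{dt}(\dot x^2 + \dot y^2) = -\dot r\,p. \]

The distance of the tangent from the origin is

\[ q = \frac{|x\dot y - y\dot x|}{\sqrt{\dot x^2 + \dot y^2}} = \frac{h}{\sqrt{\dot x^2 + \dot y^2}}; \]

therefore,

\[ \frac{1}{2}\frac{d}{dt}\Bigl(\frac{h^2}{q^2}\Bigr) = -p\,\frac{dr}{dt} \]

or

\[ \frac{1}{2}\frac{d}{dr}\Bigl(\frac{h^2}{q^2}\Bigr) = -p, \]

which proves the first statement. For the cardioid we have q = r²/√(2ar).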
8. By definition

\[ \text{(A)}\qquad \ddot x = -\lambda^2x - 2\mu\dot y, \qquad \ddot y = -\lambda^2y + 2\mu\dot x. \]

On differentiating the two equations twice and combining them, we get
an equation involving x only,

\[ x^{(4)} + (2\lambda^2 + 4\mu^2)\ddot x + \lambda^4x = 0, \]

and a corresponding equation involving y only,

\[ y^{(4)} + (2\lambda^2 + 4\mu^2)\ddot y + \lambda^4y = 0. \]

Thus, x and y are linear combinations of exp[±i(μ ± √(λ² + μ²))t] (cf.
Exercise 2, p. 696) or of cos(μ + √(λ² + μ²))t, cos(μ − √(λ² + μ²))t,
sin(μ + √(λ² + μ²))t, sin(μ − √(λ² + μ²))t, with constant coefficients a, b, c, d,
and a′, b′, c′, d′. From (A) it follows that a′ = −c, b′ = −d, c′ = a, d′ =
b. Using the initial conditions x(0) = y(0) = ẏ(0) = 0, ẋ(0) = u, we
obtain the result given.
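As a quick numerical check of the frequencies used in Exercise 8, the following sketch substitutes ω = ±(μ ± √(λ² + μ²)) into the characteristic equation that results from trying x = e^{iωt} in the fourth-order equation. The function name `characteristic` and the sample values of λ and μ are illustrative choices, not part of the original solution.

```python
import math

def characteristic(omega, lam, mu):
    # x = exp(i*omega*t) in x'''' + (2*lam^2 + 4*mu^2) x'' + lam^4 x = 0
    # gives omega^4 - (2*lam^2 + 4*mu^2) omega^2 + lam^4 = 0.
    return omega**4 - (2*lam**2 + 4*mu**2)*omega**2 + lam**4

lam, mu = 1.3, 0.4                      # arbitrary sample values
root = math.sqrt(lam**2 + mu**2)
frequencies = [mu + root, mu - root, -(mu + root), -(mu - root)]
residuals = [abs(characteristic(w, lam, mu)) for w in frequencies]
```

All four residuals vanish up to rounding, since ω² = (μ ± √(λ² + μ²))² = λ² + 2μ² ± 2μ√(λ² + μ²) are the two roots of the quadratic in ω².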

. Let (x1, y1), . . . , (Xn, yn) be the attracting particles. Then the resultant 
force at a point (x, y) has the components 

X — 2 me 9 Y = YY ° 
v v(x — xy)? + (Y — yv)? b V(x — xy)? + (y — yw)? 

If we introduce the complex quantities z₁ = x₁ + iy₁, ..., zₙ =
xₙ + iyₙ, z = x + iy, Z = X + iY, we have

where f(z) denotes the polynomial (z − z₁) ··· (z − zₙ) and z̄ the
complex quantity conjugate to z. The positions of equilibrium correspond
to Z = 0, that is, to the zeros of the polynomial f′(z) of which
there are n − 1 at most.

Positions of equilibrium in the particular case: (0, 0), (√(a² − b²), 0),
(−√(a² − b²), 0).


Exercises 6.2 (p. 682) 


1. (a) y = tan log(c/√(1 + x²)).
b) y = cvi Fe. 
2. (a) y = ce’, 
(b) y?(2x? + y?) = e. 
(c) x² − 2cx + y² = 0 (circles).
(d) arc tan(y/x) + c = log √(x² + y²) or, in polar coordinates, r = e^{θ+c}
(logarithmic spirals).


(e) c + log |x| = arc sin(y/x) — EJET. 


3. If ab₁ − a₁b ≠ 0, we have

\[ \frac{d\eta}{d\xi} = \frac{a\xi + b\eta}{a_1\xi + b_1\eta} = \frac{a + b(\eta/\xi)}{a_1 + b_1(\eta/\xi)}, \]

which is a homogeneous equation.
If ab₁ − a₁b = 0 or a₁/a = b₁/b = k, then


da _ dy _ nte) 
Tat op =at bpp 


and the variables are separated. 
4. (a) 4x + 8y + 5 = ce^{4x−8y}.
(b) x =c— ‘ay — 7x) — = log (8y — 7x). 


5. (a) y = ce™sin = + sin x — 1. 

(b) y = (x + 1)"e7 + ©). 

(c) y = cx(x — 1) + x. 

(d) y = 2 x? + cx?, 

(e) n E 
e I= Vp (+x) +I Ex)’ 


6. Introduce 1/y as a new unknown function; the equation then becomes 
homogeneous: 




1 1 — exv5 
x vV5/1 Ll, 1 1,° 
cx | 5 575} 9 M 5 
7. With this substitution, the equation becomes

\[ v' = v^n g(x)F(x)^{n-1}. \]

8. See Exercise 7. Eliminate y through v = xy, y′ = v′/x − v/x² to obtain a
separable equation;

\[ y = \frac{1}{x(c - \log x)}. \]


9. Following the idea of the substitution in Exercise 7, seek a function
f(x) such that v = yf(x) and v′ = (y′ + y sin x)f(x). From v′ = y′f(x) +
yf′(x), we have

\[ f'(x) = f(x)\sin x; \]

whence,

\[ f(x) = ae^{-\cos x}. \]

The constant a is irrelevant for our purpose, and we set a = 1. We then
obtain the separable equation

\[ v^{-n}v' = -e^{(n-1)\cos x}\sin 2x, \]


which is easily integrated by separation of variables. The final result is 


(Comt a A e 
y= Epo 7 7 cos x + ke7("—lcos z (n # 1) 
kecos z+(cos 2x)/2 (n = 1). 


Exercises 6.3b (p. 690) 


1. If any linear combination of these were to vanish, say

\[ c_1\sin n_1x + c_2\sin n_2x + \cdots + c_k\sin n_kx = 0, \]

then, on multiplication by sin n_j x, where j = 1, ..., k, and integration
over [0, π], we would obtain

\[ c_j\int_0^{\pi}\sin^2 n_jx\,dx = 0; \]

whence c_j = 0 for all j.
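The orthogonality behind this argument can be seen numerically. The sketch below assumes, as the argument requires, that the n_j are distinct positive integers; the helper name `sine_inner_product` and the sample frequencies 1, 2, 5 are illustrative only.

```python
import math

def sine_inner_product(m, n, steps=2000):
    """Midpoint-rule approximation of the integral of sin(mx) sin(nx) over [0, pi]."""
    h = math.pi / steps
    return h * sum(math.sin(m*(k + 0.5)*h) * math.sin(n*(k + 0.5)*h)
                   for k in range(steps))

freqs = (1, 2, 5)
gram = [[sine_inner_product(m, n) for n in freqs] for m in freqs]
```

The off-diagonal entries of `gram` vanish and each diagonal entry is π/2, so multiplying the vanishing combination by sin n_j x and integrating isolates c_j·π/2 = 0.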

2. Use induction. Suppose that a linear relation c₁φ₁ + ··· + c_kφ_k = 0
holds. Divide by e^{a_k x} and differentiate (n_k + 1) times if P_k(x) is of
degree n_k. The degree of the coefficients of the other e^{a_i x} is unchanged,
so that they remain different from zero.


3. Multiply both sides of the equation by (1 — n)y~". 
(a) y!=cx+logx+1. 
3a?

2x ` 

(c) (yt + a)? = c(x? — 1). 


4. If we put y = y₁ + u⁻¹, the equation reduces to the linear equation
u′ − (2Py₁ + Q)u = P.


(b) y3 = cx? + 


y=x— — APU] 
c+ iN x? exp [(1/2)x4] dx 


5. Equate the right sides of the two equations to obtain y = x? and verify 
directly that this is an integral of both equations. 


6. Note that this is equation (a) of Exercise 5 and is therefore a Riccati
equation with one solution known. Then apply the result of Exercise 4.

\[ y = x^2 - \frac{\exp[(2/3)x^3]}{c + \int \exp[(2/3)x^3]\,dx} \quad [= f(x, c)]. \]

To draw the graphs of the corresponding family of curves, first plot the
two branches of the curve

\[ y^2 + 2x - x^4 = 0, \qquad y = \pm\sqrt{(x^3 - 2)x}, \]

which divides the plane into two regions where y′ < 0 and one region
where y′ > 0. The two infinite branches of this curve are asymptotic to
the two parabolas y = ±x². Show that all the integral curves are
asymptotic to these parabolas by proving the two relations

\[ f(x, c) = -x^2 + o(1) \quad\text{as } x \to +\infty \qquad (-\infty < c < \infty) \]

and

\[ f(x, c) = x^2 + o(1) \quad\text{as } x \to -\infty \qquad (c \ne 0), \]

where o(1) denotes a function that tends to zero.
7. Put

\[ y_1 - y_3 = a, \quad y_1 - y_4 = b, \quad y_2 - y_3 = c, \quad y_2 - y_4 = d. \]

Then

\[ a' + Pa(y_1 + y_3) + Qa = 0, \]

so that

\[ P(y_1 + y_3) = -Q - \frac{a'}{a}, \qquad P(y_1 - y_3) = aP \]

or

\[ 2Py_1 = aP - Q - \frac{a'}{a}. \]

Similarly,

\[ 2Py_1 = bP - Q - \frac{b'}{b}. \]

Hence,

\[ \frac{d\log(a/b)}{dx} = P(a - b) = -P(y_3 - y_4), \]

and similarly,

\[ \frac{d\log(c/d)}{dx} = -P(y_3 - y_4); \]

by subtraction,

\[ \log\frac{a/b}{c/d} = \text{constant}. \]
8. Compare the relation

\[ \frac{d\log(a/b)}{dx} = -P(y_3 - y_4) \]

in the proof of the preceding example.

Particular solutions of the special equation are y₁ = 1/cos x and
y₂ = −1/cos x;

\[ y = \frac{1 + ce^{2x}}{(1 - ce^{2x})\cos x}. \]

9. The common solution eˣ of (a) and (b) is obtained by eliminating y″
from the two equations.
(a) cie? + cox. 
(b) cie? + cxx. 


10. The curve satisfies the differential equation

\[ n\Bigl(x\frac{dy}{dx} - y\Bigr) = r \]

or in polar coordinates r, θ, with θ as independent variable,

\[ \frac{nr^2}{\cos\theta\,\dfrac{dr}{d\theta} - r\sin\theta} = r, \]

that is,

\[ \frac{d\log r}{d\theta} = \frac{n}{\cos\theta} + \tan\theta, \]

whence,

\[ r = \frac{a\,[\tan(\theta/2 + \pi/4)]^n}{\cos\theta} = \frac{a(1 + \sin\theta)^n}{\cos^{n+1}\theta} \]

(cf. Volume I, pp. 271-272.)
Exercises 6.3c (p. 695)


1. (a) y = c₁eˣ + c₂e^{−(1/2)x} cos(√3 x/2) + c₃e^{−(1/2)x} sin(√3 x/2).


(b) y = c₁eˣ + c₂xeˣ + c₃e^{2x}.
(c) y = c₁eˣ + c₂xeˣ + c₃x²eˣ.
(d) y = cie? + coe-* + czev3z + cge-V/ 22. 
(e) Substitute x = eᵗ:
y = c₁x + c₂/x.


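2. From the fundamental theorem of algebra, it follows that f(z) may be
written

\[ f(z) = (z - a_1)^{\mu_1}(z - a_2)^{\mu_2}\cdots(z - a_k)^{\mu_k} \]

(cf. Volume I, p. 286; Volume II, p. 806), where the μ_ν's are positive
integers such that μ₁ + ··· + μ_k = n and

\[ f(a_\nu) = f'(a_\nu) = \cdots = f^{(\mu_\nu - 1)}(a_\nu) = 0. \]

Now

\[ L(e^{\lambda x}) = f(\lambda)e^{\lambda x}. \]

On differentiating this relation (μ_ν − 1) times with respect to λ and putting
λ = a_ν in the result, we get (cf. Leibnitz's rule, Volume I, p. 203)

\[
\begin{aligned}
L(e^{a_\nu x}) &= f(a_\nu)e^{a_\nu x} = 0, \\
L(xe^{a_\nu x}) &= [f'(a_\nu) + xf(a_\nu)]e^{a_\nu x} = 0, \\
L(x^2e^{a_\nu x}) &= [f''(a_\nu) + 2xf'(a_\nu) + x^2f(a_\nu)]e^{a_\nu x} = 0, \\
&\;\;\vdots \\
L(x^{\mu_\nu - 1}e^{a_\nu x}) &= \Bigl[\binom{\mu_\nu - 1}{0}f^{(\mu_\nu - 1)}(a_\nu) + \binom{\mu_\nu - 1}{1}xf^{(\mu_\nu - 2)}(a_\nu) + \cdots + \binom{\mu_\nu - 1}{\mu_\nu - 1}x^{\mu_\nu - 1}f(a_\nu)\Bigr]e^{a_\nu x} = 0.
\end{aligned}
\]

So we have n particular solutions

\[
\begin{aligned}
&e^{a_1x},\; xe^{a_1x},\; \ldots,\; x^{\mu_1 - 1}e^{a_1x}, \\
&e^{a_2x},\; xe^{a_2x},\; \ldots,\; x^{\mu_2 - 1}e^{a_2x}, \\
&\;\;\vdots \\
&e^{a_kx},\; xe^{a_kx},\; \ldots,\; x^{\mu_k - 1}e^{a_kx},
\end{aligned}
\]

which are linearly independent by Exercise 2, p. 690.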
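The second of these relations, L(xe^{ax}) = [f′(a) + xf(a)]e^{ax}, can be illustrated on a small example. The sketch below uses the exact rule Dᵏ(xe^{ax}) = (aᵏx + kaᵏ⁻¹)e^{ax}; the sample polynomial f(z) = (z − 2)²(z + 1) and the helper names are hypothetical choices made only for this illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0.0]*(len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi*qj
    return out

def apply_L_to_x_exp(coeffs, a, x):
    """Return L(x e^{a x}) / e^{a x} for L = f(D), f given by ascending coeffs.

    Uses D^k (x e^{a x}) = (a^k x + k a^(k-1)) e^{a x}.
    """
    total = 0.0
    for k, c in enumerate(coeffs):
        derivative_part = k * a**(k - 1) if k > 0 else 0.0
        total += c * (a**k * x + derivative_part)
    return total

# f(z) = (z - 2)^2 (z + 1): a = 2 is a double root, a = -1 a simple root.
f = poly_mul(poly_mul([-2.0, 1.0], [-2.0, 1.0]), [1.0, 1.0])
double_root_values = [apply_L_to_x_exp(f, 2.0, x) for x in (0.0, 0.5, 3.0)]
simple_root_value = apply_L_to_x_exp(f, -1.0, 1.0)
```

For the double root the result is zero for every x, so xe^{2x} is a solution; for the simple root the value is f′(−1) = 9, so xe^{−x} is not.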
3. On substituting in the differential equation, we get

\[ (a_0b_0 - 1)P(x) + (a_0b_1 + a_1b_0)P'(x) + (a_0b_2 + a_1b_1 + a_2b_0)P''(x) + \cdots = 0, \]

and this is an identity if a₀b₀ = 1, a₀b₁ + a₁b₀ = 0, ..., from the
expansion. The second case reduces to the first if we substitute y′ for y.


4. (a) 1/(1 + t²) = 1 − t² + t⁴ − ···; hence,


y = P(x) − P″(x) = 3x² − 5x − 6.
(b) 1/(t + t²) = (1/t) − 1 + t − t² + ···; hence,
2 


_3, —1,3,2 
- (a) y= Ge. (b) y gue 


x2 


y= ay + Sx + A + cie? + ce’. 


2 


. (b) The equation becomes of the form treated in (a) if we multiply it by 


x3. It has the particular solutions u = x? and y = x5; hence, by (a), a 
third solution is given by w = 1 + x?; the general solution is then 


A(i + x?) + Bx? + Cx’. 


Exercises 6.4 (p. 706) 


1. 


(a) x² + y² + cx + 1 = 0 (−∞ < c < ∞) and the line x = 0.

(b) x² + 2y² = c².

(c) The differential equation of the family of confocal conics (cf. p. 256)
is found to be

\[ y'^2 + \frac{x^2 - y^2 - a^2 + b^2}{xy}\,y' - 1 = 0, \]

which is unaltered if y′ is replaced by −1/y′; the family of ellipses
(−b² < c < ∞) is orthogonal to the family of hyperbolas (−a² <
c < −b²).

(d) y = log|tan(x/2)| + c and the vertical lines x = kπ (k an integer).

(e) The family of curves (tractrix)

\[ x - c = \pm\bigl[\sqrt{a^2 - y^2} - a\,\operatorname{ar\,cosh}(a/y)\bigr] \]

and the same family reflected in the x-axis.


2. (a) The family of parabolas y = cx².


(b) The family of hyperbolas xy = c.


3. (a) y = x². (b) y = −x + x log(−x) (0 > x > −∞).
4. y = xp + a√(1 + p²) − ap ar sinh p.


. x =ce pla + 5P 


y =c(p + aje?’ + 5 PP + a) — T (p + a)’. 
Note that for c = 0 this gives the parabola y = x² − (a²/4). What is
the geometrical meaning of this result?


6. (a) y = sin(x + c), singular solutions y = ±1.


(b) x = ±½(arc sin y + y√(1 − y²)) + c.


= — — 2a arc tan Y 

(c) x= + (v@a yy m — | + c, 
which is a family of cycloids and can be expressed in the parametric
form x = c + a(φ − sin φ), y = a(1 − cos φ). Singular solution
y = 2a.


y [J 2 
(d) rat f Hoe (-1 <y <1); 


singular solutions y = +1. (The reader should prove that these 
curves are not sine curves. The expression for x can be expressed 
in terms of elliptic integrals of the second kind; see Volume I, pp. 
436 ff. Section 4.1g, Problem 1.) 

7. y = x sin ax; singular solutions y = x and y = −x.

8. In each case, let the equation of the tangent line be given in the form

x/a + y/b = 1.

(a) Clairaut equation, y = xp + kp/(p − 1), where k = a + b. The singular
integral is the parabola x² − 2xy + y² − 2kx − 2ky + k² = 0 symmetric
about the line x = y and tangent to the x- and y-axes at the
points (k, 0) and (0, k), respectively.

(b) Set a = k cos θ and b = k sin θ, where k is the intercepted length
on the tangent, and use θ as the parameter along the curve. The
Clairaut equation is y = xp + kp/√(1 + p²). The parametric equations
of the curve are x = k cos³ θ, y = k sin³ θ. This is the astroid of
Volume I, p. 436, Section 4.1e, Problem 7.

(c) Set |ab| = k. The Clairaut equation is y = xp + √(k|p|). The curve
is the union of two rectangular hyperbolas 4xy = ±k.
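For part (b) of Exercise 8 the defining property, that the axes cut a segment of fixed length k from every tangent, can be checked directly on the parametric form x = k cos³ θ, y = k sin³ θ. This is a minimal sketch; the function name `tangent_intercepts` and the sample values of k and θ are illustrative.

```python
import math

def tangent_intercepts(k, theta):
    """Axis intercepts of the tangent to x = k cos^3(theta), y = k sin^3(theta)."""
    x, y = k*math.cos(theta)**3, k*math.sin(theta)**3
    dx = -3*k*math.cos(theta)**2*math.sin(theta)   # dx/dtheta
    dy = 3*k*math.sin(theta)**2*math.cos(theta)    # dy/dtheta
    slope = dy/dx                                  # equals -tan(theta)
    x_intercept = x - y/slope
    y_intercept = y - slope*x
    return x_intercept, y_intercept

k = 2.5
angles = (0.2, 0.7, 1.1, 1.4)
lengths = [math.hypot(*tangent_intercepts(k, th)) for th in angles]
```

The intercepts come out as k cos θ and k sin θ, in agreement with the choice a = k cos θ, b = k sin θ made in the solution, so every entry of `lengths` equals k.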


Exercises 6.5 (p. 710) 
1. (a) Rewrite as (½y′²)′ = x;
y = ½x√(x² + a) + ½a log(x + √(x² + a)) + b.
(b) Rewrite as (y″²)′ = 1;
y = (4/15)(x + a)^{5/2} + bx + c.


(c) Rewrite as (xy′)′ = 2;
y = 2x + a log x + b.
(d) Rewrite as x (y”2Y = y”2 — 2 and introduce y’”? as a new independent


variable. y = x? + ax? +bx +c. 
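The answer to part (c) of Exercise 1, y = 2x + a log x + b, can be tested against the rewritten equation (xy′)′ = 2 by finite differences. The sketch below is only a numerical sanity check; the step size h and the sample constants are arbitrary, and the helper names are our own.

```python
import math

def y(x, a, b):
    return 2*x + a*math.log(x) + b

def residual(x, a, b, h=1e-3):
    """Central-difference estimate of (x y')' - 2 for y = 2x + a log x + b."""
    def x_times_yprime(t):
        return t * (y(t + h, a, b) - y(t - h, a, b)) / (2*h)
    return (x_times_yprime(x + h) - x_times_yprime(x - h)) / (2*h) - 2.0

residuals = [abs(residual(x, a=1.7, b=-0.3)) for x in (0.5, 1.0, 2.0, 4.0)]
```

The residuals are of the size of the discretization error, as expected from xy′ = 2x + a.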


2. (a) y = (ax + b)”. 
(b) y = Va + (x + b} 
(c) y = Val(x + 6)? + a~l. 


(d) The equation can be expressed in the form p(d/dy) (ply) = 1. y = 
a/(1 — be%*). Note solutions p = 0, y = constant. 


(e) Introduce new variables z and q, where z = y”, q = y” and q(dq/dz) 
= yi, 


y=ax + bx +et i (5+) 
15 \2 


(£) Proceed as in part (e): 
y=ax+6+csin(x+d). 


3. MN = y√(1 + y′²), MC = −[(1 + y′²)^{3/2}/y″], and the differential equation
is

\[ (1 + y'^2)^2y + ky'' = 0. \]

By the general method this is easily reduced to

\[ \Bigl(\frac{dy}{dx}\Bigr)^2 = \frac{k + c - y^2}{y^2 - c} \qquad (c \text{ an arbitrary constant}). \]

The various cases, all of importance in the differential geometry of
surfaces,¹ are as follows:

(1) k = κ² (> 0), c = −γ² (< 0, γ² < κ²). The curve is everywhere
smooth and oscillates, alternately touching the lines y = ±√(κ² − γ²).
It looks like a sine curve, but is not one.

(2) k = κ², c = 0. The curve is a circle of radius κ with center on the
x-axis.

(3) k = κ², c = γ² (> 0). The curve consists of a sequence of identical
arcs, joined by cusps lying on the line y = γ, and all touched by
y = √(κ² + γ²). It looks like a cycloid but is not one.

(4) k = −κ² (< 0), c = γ² > κ². The curve consists of a sequence of
identical arcs upside-down, with their cusps on y = γ and touched
by y = √(γ² − κ²).

(5) k = −κ², c = γ² = κ². The curve is a tractrix.

(6) k = −κ², c = γ² < κ². The curve has an infinity of cusps perpendicular
to the lines y = γ and y = −γ alternately.

4. Eliminate a, b, c by using the equations obtained by differentiating the
equation of the circle three times successively.

\[ (1 + y'^2)y''' - 3y'y''^2 = 0. \]


¹See L. P. Eisenhart, A Treatise on the Differential Geometry of Curves and Surfaces,
reprinted by Dover (N.Y., 1960), pp. 270-274.


Exercises 6.6 (p. 713) 


1. (a) preasenn E (v= 2). 
(b) co = z = 1, cev = 0, cev+1 = a n (v= 1). 


(c) c= 0,01 =1,c2=0,c0=5. 
x2? x3 
(d) ltet tyt 


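2. If y(x) = Σ c_ν x^ν, then

\[ c_{\nu+2} = -\frac{c_\nu}{(\nu + 2)^2} \quad\text{and}\quad c_0 = 1,\ c_1 = 0; \]

\[ y(x) = \sum_{\nu=0}^{\infty}\frac{(-1)^\nu x^{2\nu}}{2^{2\nu}(\nu!)^2}. \]

If we substitute the power series for cos xt in the expression for J₀(x)
in Exercise 7, p. 475, and interchange summation and integration (Why
is this permissible?), we obtain

\[ J_0(x) = \frac{1}{\pi}\sum_{\nu=0}^{\infty}\frac{(-1)^\nu x^{2\nu}}{(2\nu)!}\int_{-1}^{+1}\frac{t^{2\nu}}{\sqrt{1 - t^2}}\,dt; \]

the value of

\[ \int_{-1}^{+1}\frac{t^{2\nu}}{\sqrt{1 - t^2}}\,dt \quad\text{is}\quad \frac{(2\nu)!}{2^{2\nu}(\nu!)^2}\,\pi, \]

as is found by putting t = sin τ and referring to Volume I, p. 280. The
power series for y(x) and J₀(x) are therefore identical.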
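The agreement between the power series and the integral representation can also be observed numerically. The sketch below assumes the integral representation J₀(x) = (1/π)∫ cos(xt)/√(1 − t²) dt over −1 ≤ t ≤ 1 and removes the endpoint singularity with the same substitution t = sin τ; the function names and the sample points are illustrative.

```python
import math

def j0_series(x, terms=30):
    """Partial sum of sum_v (-1)^v x^(2v) / (2^(2v) (v!)^2)."""
    return sum((-1)**v * (x/2)**(2*v) / math.factorial(v)**2 for v in range(terms))

def j0_integral(x, steps=2000):
    """(1/pi) * integral of cos(x sin(tau)) over [-pi/2, pi/2], midpoint rule."""
    h = math.pi / steps
    total = sum(math.cos(x*math.sin(-math.pi/2 + (k + 0.5)*h)) for k in range(steps))
    return h * total / math.pi

points = (0.0, 1.0, 2.5, 5.0)
pairs = [(j0_series(x), j0_integral(x)) for x in points]
```

Both columns of `pairs` agree to many digits, and the first entry is J₀(0) = 1.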


Exercises 6.7 (p. 726) 


1. Poisson's formula gives a potential function u(r, θ) inside the unit
circle, with boundary values f(θ). Now u(1/r, θ) is also a potential
function (cf. p. 58, Exercise 4) with the same boundary values, and it is
bounded in the region outside the unit circle; thus, the expression

\[ \frac{1}{2\pi}\int_0^{2\pi} f(\alpha)\,\frac{r^2 - 1}{1 - 2r\cos(\theta - \alpha) + r^2}\,d\alpha \]

is a solution of the problem.
2. The potential is

\[ u = \log\frac{z + l + \sqrt{(z + l)^2 + x^2 + y^2}}{z - l + \sqrt{(z - l)^2 + x^2 + y^2}}. \]

Since on the ellipsoid z = lα cos φ, √(x² + y²) = l√(α² − 1) sin φ, the
potential is

\[ u = \log\frac{\alpha + 1}{\alpha - 1}, \]

the confocal ellipsoids

\[ \frac{z^2}{l^2\alpha^2} + \frac{x^2 + y^2}{l^2(\alpha^2 - 1)} = 1 \qquad (1 < \alpha < \infty) \]

are equipotential surfaces. The lines of force are the orthogonal trajectories
and hence (cf. Exercise 1c, p. 707) are the confocal hyperbolas
given by the same equation when 0 < α < 1 and the ratio of x to y is
constant.

3. Let >> be a sphere of radius pe and center (x, y, z), lying inside S. Since 
A(1/r) = 0 and Au = 0 in the region bounded by }, and S, by Green’s 
theorem (cf. p. 608) we have 


1 3u A(Ur)\ oo ( ðu _ Ar) 
0= I, H ðn “ân | de fh r an an Sy) do 
where in the first integral n is the outward normal to S and in the 


O(1/r) _ 
ôn 


second the outward normal to >>. Now on the sphere >; we have 


ð 
(1/r) = — i , r = constant = 9; therefore, 
or e 


Ser an®= fen. 


since u is a harmonic function (cf. p. 720); in addition, 


-i [fuged =o [fu do, 


and as pọ — 0, this expression obviously tends to u(x, y, z), for it is the 
mean value of u on È. 


Exercises 6.8 (p. 734) 


1. (a) u = f(x) + g(y); f and g are arbitrary functions. 
(b) u = f(x, y) + g(x, z) + h(y, z); f, g, h are arbitrary functions. 


(c) The most general solution is obtained from a particular solution
by adding the general solution of the homogeneous equation u_xy = 0.

\[ u = \int_0^x d\xi\int_0^y a(\xi, \eta)\,d\eta + f(x) + g(y), \]

where f and g are arbitrary.
2. If u(x, y) = Σ a_{νμ} x^ν y^μ, then

\[ a_{\nu+1,\,\mu+1} = \frac{a_{\nu\mu}}{(\nu + 1)(\mu + 1)}; \]

in addition,

\[ a_{\nu 0} = a_{0\nu} = 0 \]

for ν ≥ 1 and a₀₀ = 1. Hence,

\[ u(x, y) = \sum_{\nu=0}^{\infty}\frac{x^\nu y^\nu}{(\nu!)^2} = J_0(2i\sqrt{xy}), \]

where J₀ is the Bessel function of Exercise 2, p. 713.
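The recurrence in Exercise 2 corresponds to the equation u_xy = u with u = 1 on both axes, and this can be checked on the series itself. The sketch below is a numerical illustration under that reading of the exercise; the function names, the step h, and the sample points are our own choices.

```python
import math

def u(x, y, terms=40):
    """Partial sum of the double series: a_{vv} = 1/(v!)^2, all other coefficients 0."""
    return sum((x*y)**v / math.factorial(v)**2 for v in range(terms))

def mixed_derivative(x, y, h=1e-3):
    """Central-difference estimate of u_xy."""
    return (u(x + h, y + h) - u(x + h, y - h)
            - u(x - h, y + h) + u(x - h, y - h)) / (4*h*h)

points = ((0.3, 0.8), (1.0, 1.0), (-0.5, 2.0))
checks = [(mixed_derivative(x, y), u(x, y)) for x, y in points]
axis_values = [u(0.0, 1.7), u(-2.3, 0.0)]
```

In each pair of `checks` the mixed derivative reproduces u up to discretization error, and the values on the axes are 1.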

3. z²(z_x² + z_y² + 1) = 1.

4. A one-parameter family is obtained from the two-parameter family of
solutions z = u(x, y, a, b) by making a and b depend in some way on a
parameter t:

\[ a = f(t), \qquad b = g(t), \qquad z = u(x, y, f(t), g(t)). \]

The envelope of this one-parameter family is obtained by finding t from
the equation

\[ 0 = z_t = u_af' + u_bg', \]

and substituting this expression for t in z = u(x, y, f(t), g(t)). The
result is again a solution of F(x, y, z, z_x, z_y) = 0, as

\[
\begin{aligned}
z &= u(x, y, a, b), \\
z_x &= u_x + u_t t_x = u_x(x, y, a, b), \\
z_y &= u_y + u_t t_y = u_y(x, y, a, b),
\end{aligned}
\]

and z = u(x, y, a, b) satisfies the equation F(x, y, z, z_x, z_y) = 0.
5. (a) From the differential equation we get

\[ [f'(x)]^2 + [g'(y)]^2 = 1 \]

or

\[ [f'(x)]^2 = 1 - [g'(y)]^2. \]

As the left-hand side does not depend on y, nor the right-hand side
on x, both sides are equal to a constant (which has to be positive or
zero), say c²; that is,

\[ [f'(x)]^2 = c^2, \qquad 1 - [g'(y)]^2 = c^2. \]

Hence,

\[ u = cx + \sqrt{1 - c^2}\,y + b \]

is a solution, where c and b are arbitrary and c² ≤ 1.


(b) u = f(x) + g(y) gives

\[ f'(x) = \frac{1}{g'(y)} = \text{constant} = a, \]

so that

\[ u = ax + \frac{1}{a}\,y + b \]

(where a and b are constants).
If u = f(x)g(y), then

\[ \frac{d}{dx}[f(x)]^2 = \frac{4}{\dfrac{d}{dy}[g(y)]^2} = \text{constant} = 2c; \]

so, in this case,

\[ u = \sqrt{(2cx + a)\Bigl(\frac{2}{c}\,y + b\Bigr)}, \]

where a, b, c are arbitrary constants.
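The separated solutions of Exercise 5 are easy to test numerically. The sketch below assumes that part (a) concerns u_x² + u_y² = 1 and part (b) concerns u_x u_y = 1, which is what the relations [f′]² + [g′]² = 1 and f′ = 1/g′ suggest; the helper names and sample constants are illustrative only.

```python
import math

def partials(u, x, y, h=1e-5):
    """Central-difference estimates of u_x and u_y."""
    ux = (u(x + h, y) - u(x - h, y)) / (2*h)
    uy = (u(x, y + h) - u(x, y - h)) / (2*h)
    return ux, uy

def solution_a(c, b):
    # u = c x + sqrt(1 - c^2) y + b, requires c^2 <= 1
    return lambda x, y: c*x + math.sqrt(1 - c*c)*y + b

def solution_b(a, b):
    # u = a x + y / a + b, requires a != 0
    return lambda x, y: a*x + y/a + b

eikonal = []
for c in (-1.0, -0.3, 0.0, 0.6, 1.0):
    ux, uy = partials(solution_a(c, 3.0), 0.4, -1.2)
    eikonal.append(ux*ux + uy*uy)

ux, uy = partials(solution_b(2.5, -1.0), 0.4, -1.2)
product = ux*uy
```

Every entry of `eikonal` equals 1 for c² ≤ 1, including the limiting values c = ±1, and `product` equals 1.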


O u=zy +k sti FER 
6. Apply the linear transformation

\[ x = \xi + \eta, \qquad y = 3\xi + 2\eta, \]

u = f(y — 2x) + g(3x — y) + a erv, 


. Put u = (x2 + y2 + z?)"/2 and let K be of degree h. Then, 
Au = Urz + Uyy + Uzz = n(n + 1) (x? + y? + Z2)(n—-2)/2 


ety ae +255 = hK 
Oz 


(cf. p. 120). Hence, u = (x? + y? + z?)-(+”)/2 is a solution. 
8. According to p. 728, a solution of the first equation is of the form
z = f(x + at) + g(x — at). 
On substituting this expression in the second equation, we have 
f'g’ = 0; 
that is, either f = constant or g = constant. Hence, z = f(x + at) or 
z = f (x — at) is the most general solution of both equations. 


9. (a) From the differential equation

\[ \frac{\varphi''(x)}{\varphi(x)} = \frac{1}{c^2}\,\frac{\psi''(t)}{\psi(t)} = \lambda, \]

a constant. The boundary conditions can be satisfied only if λ = −n²,
where n is an integer and

\[ \varphi(x) = \alpha\sin nx, \]

whence,

\[ \psi(t) = a\sin nct + b\cos nct. \]

Thus, the most general particular solution of the specified type is

\[ u(x, t) = \sin nx\,(a\sin nct + b\cos nct). \]

(b) Using sin A sin B = ½[cos(A − B) − cos(A + B)] and sin A cos B
= ½[sin(A + B) + sin(A − B)], we obtain

\[ u(x, t) = \tfrac{1}{2}[a\cos n(x - ct) + b\sin n(x - ct)] - \tfrac{1}{2}[a\cos n(x + ct) - b\sin n(x + ct)]. \]


(c) Assume a solution in the form of a sum of solutions of the type 
obtained in part (a), that is, 


u(x, t) = }, sin nx(an sin net + bn cos nct). 
n=1 
In order to satisfy the initial conditions in (ii), we must have 


bn = An, an = 0. 
For the solution of (i), observe from Volume I, p. 587, (17), that 


on = ae —f(—x) sin nx dx + f f(x) sin nx dx 


= ac sin nx dx. 


For the particular function in Oe we find «zv =0, aes = 
(—1)v/x(2v + 1)?, where v = 0, 1, 2, 


whence 
u(x, t) = 1/sin x cos ct _ sin 3x cos 3ct 
? x 12 32 
sin 5x cos dct | 
sin Gx con Bet 
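The identity used in part (b) of Exercise 9, which rewrites the standing wave sin nx (a sin nct + b cos nct) as a difference of two travelling waves, can be confirmed numerically. The sketch below compares the two forms at a few sample points; the function names and the sample values of n, c, a, b are illustrative.

```python
import math

def standing(x, t, n, c, a, b):
    return math.sin(n*x) * (a*math.sin(n*c*t) + b*math.cos(n*c*t))

def travelling(x, t, n, c, a, b):
    # Wave moving to the right (argument x - ct) and to the left (x + ct).
    right = a*math.cos(n*(x - c*t)) + b*math.sin(n*(x - c*t))
    left = a*math.cos(n*(x + c*t)) - b*math.sin(n*(x + c*t))
    return 0.5*right - 0.5*left

params = dict(n=3, c=2.0, a=0.7, b=-1.3)
samples = [(0.3, 0.0), (1.1, 0.4), (2.9, 1.7), (math.pi, 2.0)]
gaps = [abs(standing(x, t, **params) - travelling(x, t, **params)) for x, t in samples]
boundary = [abs(standing(0.0, 0.9, **params)), abs(standing(math.pi, 0.9, **params))]
```

The two forms coincide at every sample point, and the standing wave vanishes at x = 0 and x = π for integer n.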


10. u(x, t) = f(x − at) + g(x + at); then, for x ≥ 0,

\[ 0 = u(x, 0) = f(x) + g(x), \qquad 0 = u_t(x, 0) = -af'(x) + ag'(x); \]

by differentiating the first equation and comparing with the second,
we have

\[ f'(x) = 0, \qquad g'(x) = 0, \]

or

\[ f(x) = \text{constant} = c, \qquad g(x) = -c \quad\text{for } x \ge 0. \]

For t ≥ 0, moreover,

\[ \varphi(t) = u(0, t) = f(-at) + g(at) = f(-at) - c; \]

that is, f(ξ) = c + φ(ξ/(−a)) if ξ ≤ 0. As x + at ≥ 0 always, and, hence,
g(x + at) = −c, it follows that

\[ u(x, t) = \begin{cases} 0 & \text{for } x - at \ge 0, \\ \varphi\bigl(t - \dfrac{x}{a}\bigr) & \text{for } x - at \le 0, \end{cases} \]

if both x and t are nonnegative.


Exercises 7.2a (p. 743) 


1. \[ T = \frac{2}{\sqrt{2g}}\sqrt{\frac{(x_1 - x_0)^2 + (y_1 - y_0)^2}{y_1 - y_0}}. \]


2. T = ∫ f(r)√(ṙ² + r²θ̇² + r² sin²θ φ̇²) dσ.


Exercises 7.2d (p. 751) 


x2 
4c2° 


(b) Circle with center on x-axis. 


1. (a) Parabolas y = c? + 


— a 


(c) y= csin ~ 


2. y = a- bforn >l, and y = a log x + b for n = 1. 


~~ xn-l 


3. y = a(x — b)™"+*™ ifn +m + 0; y = ae? if n = —m. 


4. ay″ + a′y′ + (b′ − c)y = 0; for b = constant,

\[ \int_{x_0}^{x_1} b\,yy'\,dx = \frac{b}{2}(y_1^2 - y_0^2) \]

only depends on the end points of the curve y = y(x).


7 
5° 


6. Consider F(x, y) for fixed x as a function of y; let this function of y have
a minimum for y = ȳ. Then, F(x, y) ≥ F(x, ȳ) for a certain neighborhood
of ȳ and F_y(x, ȳ) = 0. ȳ will depend on the parameter x [i.e., ȳ = ȳ(x)].
Then, for any neighboring function y, we have

\[ \int_{x_0}^{x_1} F(x, y(x))\,dx \ge \int_{x_0}^{x_1} F(x, \bar y)\,dx, \]

5. yı — Yo < 


where ȳ(x) satisfies the equation F_y(x, ȳ(x)) = 0.
7. (a) y =0. 
(b) Use Cauchy's inequality. For any admissible y,

\[ 1 = y(1) - y(0) = \int_0^1 y'\,dx \le \sqrt{\int_0^1 1\,dx\int_0^1 y'^2\,dx} = \sqrt{I} \]

and the equality sign holds for y = x.


8. Introduce 1/r as new dependent variable in Euler's equation. The general
solution is the line 1/r = a cos θ + b sin θ.


Exercises 7.3b (p. 757) 


1. If v = 1/f(r), then T is given by Exercise 2, p. 743:

\[ F = f(r)\sqrt{\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\varphi^2}. \]

Euler's equation for the variable φ gives

\[ F_{\dot\varphi} = \frac{f^2(r)\,r^2\sin^2\theta\,\dot\varphi}{F} = \text{constant} = C \]

along a ray. Now let the polar coordinates be chosen in such a way that
the plane φ = 0 passes through the initial point and the end point; since
φ = 0 at both these points, we have φ̇ = 0 for some intermediate point,
by the mean value theorem, that is, C = 0; but then φ̇ = 0 for the whole
ray, that is, φ = 0. Hence the whole ray must lie in the plane φ = 0.


2. See Exercise 1. Using φ as parameter, we have to minimize r∫√(θ̇² + sin²θ) dφ,
where r = constant. Introducing cot θ as new dependent variable in
Euler's equation leads to the general solution cot θ = a cos φ +
b sin φ, corresponding to a curve of intersection of the sphere with a
plane through the center.

3. See Exercise 1 above. Here in spherical coordinates we have θ = constant.
Introducing r as dependent and φ sin θ as independent variable
yields the same integral to be minimized as in Exercise 8, p. 752. (The
mapping of the point of the cone with spherical coordinates r, θ, φ
onto the point in the plane with polar coordinates r, φ sin θ preserves
arc length.)


1/r = a cos(φ sin θ) + b sin(φ sin θ).


4. The path has to be straight, since it has to have minimum length for 
given end points. We only have to find the minimum distance between 
two points constrained to move on two given curves, which is a minimum 
problem for a function of several variables with subsidiary conditions 
(cf. Chapter 3, p. 337). 


5. See solution to next problem. 


6. Let the end points be constrained to lie on the curves y = f(x) and
y = g(x), respectively. Let the minimizing curve have end points
$(a_0, f(a_0))$, $(b_0, g(b_0))$, and an equation y = u(x), where $u(a_0) = f(a_0)$,
$u(b_0) = g(b_0)$. Since u also is an extremal for fixed end points, it satisfies Euler’s
equation. Consider a family of curves y = u(x) + εη(x) with parameter
ε and end points (a, f(a)), (b, g(b)), where a = a(ε), b = b(ε) are solutions
of f(a) = u(a) + εη(a), g(b) = u(b) + εη(b). The corresponding integral is

\[ G(\varepsilon) = \int_{a(\varepsilon)}^{b(\varepsilon)} F\bigl(x, u(x) + \varepsilon\eta(x)\bigr)\sqrt{1 + \bigl(u'(x) + \varepsilon\eta'(x)\bigr)^2}\,dx. \]

For the extremal u we have the condition 0 = G′(0). We evaluate G′(0)
as on pp. 743-744, using integration by parts to eliminate η′(x).
Because u satisfies Euler’s equation the only contributions arise from
differentiating the limits in the integral for G and from the boundary
terms in the integration by parts. Noticing that, for ε = 0,

\[ \bigl[f'(a) - u'(a)\bigr]\frac{da}{d\varepsilon} = \eta(a), \qquad \bigl[g'(b) - u'(b)\bigr]\frac{db}{d\varepsilon} = \eta(b) \]

and that η(a), η(b) are arbitrary, we find the relations

\[ 0 = 1 + u'(a_0) f'(a_0) = 1 + u'(b_0) g'(b_0) \]

expressing orthogonality at the end points.


Exercises 7.4a (p. 765) 


1. The law of conservation of energy gives

\[ T + U = T = \frac{1}{2}\left(\frac{ds}{dt}\right)^2; \]

hence, ds/dt = constant = C = initial velocity.
Then Hamilton’s principle asserts the stationary character of

\[ \int_{t_0}^{t_1} (T - U)\,dt = \int_{t_0}^{t_1} T\,dt = \frac{1}{2}C^2\int_{t_0}^{t_1} dt = \frac{1}{2}C\int_{s_0}^{s_1} ds; \]

the stationary character of Hamilton’s integral implies that the length
of path is stationary.


2. Let t be a parameter along the curve C. On the geodesic perpendicular to
C at a point of C with parameter t, we use arc length s as parameter,
counting s from the point on C. Then, for s = constant = $s_0$,
x = x(s, t), y = y(s, t), z = z(s, t)
shall represent the curve obtained by laying off a fixed geodesic distance
$s_0$ along each geodesic perpendicular to C at the point with parameter t.
Here, since s is arc length, we have $x_s^2 + y_s^2 + z_s^2 = 1$; moreover, by
formula (19), p. 765, $x_{ss}, y_{ss}, z_{ss}$ are proportional to $G_x, G_y, G_z$, and
G(x, y, z) = 0 for all s, t in question. On C (i.e., for s = 0) we have by
assumption $x_s x_t + y_s y_t + z_s z_t = 0$. Then,

\[ \frac{\partial}{\partial s}(x_s x_t + y_s y_t + z_s z_t) = \lambda(G_x x_t + G_y y_t + G_z z_t) + x_s x_{st} + y_s y_{st} + z_s z_{st} = \lambda\frac{dG}{dt} + \frac{1}{2}\frac{\partial}{\partial t}(x_s^2 + y_s^2 + z_s^2) = 0. \]

Hence, $x_s x_t + y_s y_t + z_s z_t$ = constant = 0 for all s, which proves that
the curves C′ for which s = constant are perpendicular to the geodesics.


Exercises 7.4b (p. 767)


1. From the differential equations for geodesics (p. 765) we find that for a
cylinder (i.e., if G does not depend on z) dz/dt is constant; hence, the
geodesics on a cylinder make a constant angle with the x, y-plane.


a 

6y’(y"2 + Ay’ y’”) 2y”” 48y'2y"3 
b — o ot = 0. 
( ) g(x) al + y’2)4 + (1 + y'2)3 (1 + 2) 


(c) y+ y” + y” =Q. 
(d) (2 — y?) y” = 0. 


. (a) od = (az + by)bz + (bz + Cy)by + abzz + 2b$zry + chy. 


(b) 42% = 0. 

(c) Ad = 0. 

au” + atu’ + ub —c) _ à = constant. 
u 


. (a) Euler’s equation gives

\[ f + 2\lambda u = 0; \]

from this equation and $\int u^2\,dx = K^2$, we have

\[ \lambda = \pm\frac{1}{2K}\sqrt{\int f^2\,dx}, \qquad u = \mp\frac{K f}{\sqrt{\int f^2\,dx}}. \]

(b) For any continuous admissible φ we have

\[ I = \int f\phi\,dx \le \sqrt{\int f^2\,dx}\,\sqrt{\int \phi^2\,dx} = K\sqrt{\int f^2\,dx}, \]

the equality sign holding for φ = u.


. From the necessary condition (6b), p. 742, we find that

\[ \int_{x_0}^{x_1} \bigl(F_{yy}\eta^2 + 2F_{yy'}\eta\eta' + F_{y'y'}\eta'^2\bigr)\,dx \ge 0 \]

for any η(x) vanishing at $x = x_0, x_1$. Let h and ξ be such that
$x_0 < \xi - h < \xi < \xi + h < x_1$. Define η(x) to be $[(x - \xi)^2 - h^2]^2 h^{-7/2}$ for |x − ξ| < h,
and to be 0 elsewhere. For h → 0, the integral tends to $c\,F_{y'y'}(\xi, u(\xi), u'(\xi))$,
where c is a positive constant.
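
The role of the scaling in the test function can be illustrated numerically. The sketch below assumes the exponent −7/2 on h (the power that keeps the integral of η′² fixed while the integrals of η² and ηη′ shrink with h); the printed exponent is illegible in the source, so this value is a reconstruction. Under that assumption the integral of η′² equals 256/105 for every h, which is the positive constant c.

```python
def eta_prime(x, xi, h):
    """Derivative of eta(x) = ((x - xi)^2 - h^2)^2 * h^(-7/2) for |x - xi| < h."""
    return 4.0 * (x - xi) * ((x - xi) ** 2 - h ** 2) * h ** -3.5

def integral_eta_prime_squared(xi, h, n=20000):
    """Midpoint-rule approximation of the integral of eta'^2 over [xi - h, xi + h]."""
    dx = 2.0 * h / n
    total = 0.0
    for k in range(n):
        x = xi - h + (k + 0.5) * dx
        total += eta_prime(x, xi, h) ** 2
    return total * dx
```

Because the value does not depend on h, only the coefficient of η′² survives in the limit, evaluated at the point ξ.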


. Problem really identical to standard isoperimetric problem. Solution is a 


circular arc, but since solutions are functions of x, there is an upper 
bound on permissible lengths in this problem, namely, 


2[(x1 — xo)? + (yı — yo)*] arc tan 2T 2o. 
x1 — xo | yı — yol 
Exercises 8.1 (p. 777)


1. (a) Set $\alpha = a_1 + ia_2$, $\beta = b_1 + ib_2$.
For the example of multiplication,

\[ \overline{\alpha\beta} = (a_1b_1 - a_2b_2) - i(a_1b_2 + a_2b_1) = \bar\alpha\bar\beta. \]


(b) Follows directly from part (a) on passage to the limit of the real and
imaginary parts of the partial sums.


2. (a) From Exercise 1, $\overline{P(\alpha)} = P(\bar\alpha)$; hence, P(α) = 0 implies $P(\bar\alpha) = 0$, and
conversely.
(b) By long division express P(z) in the form

\[ P(z) = (z^2 - 2az + a^2 + b^2)\,Q(z) + cz + d, \]

where Q(z) is a polynomial with real coefficients and c and d are
real. Setting z = α = a + ib in this equation, obtain cα + d = 0; whence,

ca + d = 0 and icb = 0.
Since b ≠ 0, c = 0, and hence, d = 0.


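3. (a) Use the equation of a circle in the form

\[ (z - z_0)(\bar z - \bar z_0) = r^2. \]

Then

\[ z_0 = \frac{\alpha - \lambda^2\beta}{1 - \lambda^2}, \qquad r^2 = z_0\bar z_0 - \frac{\alpha\bar\alpha - \lambda^2\beta\bar\beta}{1 - \lambda^2}. \]

If λ = 1, z = x + iy, the equation becomes that of a straight line,
ax + by = c, where a = 2 Re(α − β), b = 2 Im(α − β), c = |α|² − |β|².


(b) Invert the transformation to obtain

\[ z = \frac{-\delta z' + \beta}{\gamma z' - \alpha}; \]

then show that

\[ |z - z_1| = \lambda\,|z - z_2| \]

becomes

\[ |z' - z_1'| = \lambda\left|\frac{\gamma z_1' - \alpha}{\gamma z_2' - \alpha}\right|\,|z' - z_2'|. \]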


4. For x = 0.
5. Use the comparison test.
6. The coefficient of $z^n$ in the expansion of cos²z + sin²z for n > 0 (n even) is

\[ (-1)^{n/2}\sum_{\nu=0}^{n}\frac{(-1)^{\nu}}{\nu!\,(n-\nu)!} = \frac{(-1)^{n/2}}{n!}\sum_{\nu=0}^{n}(-1)^{\nu}\binom{n}{\nu} = 0 \]

[cf. Volume I, p. 110, Exercise 1 (b)].


7. The series is convergent if, and only if, |z| < 1, for if |z| = θ < 1, then

\[ \left|\frac{z^{\nu}}{1 - z^{\nu}}\right| \le \frac{\theta^{\nu}}{1 - \theta^{\nu}} \le \frac{\theta^{\nu}}{1 - \theta}, \]

and we may compare with the geometric series. If |z| > 1, then $z^{\nu}/(1 - z^{\nu})$
tends to −1 as ν increases, whereas in a convergent series the terms
must tend to 0. If |z| = 1, each term of the series either is undefined or
has absolute value ≥ 1/2, and the series cannot converge.


Exercises 8.2 (p. 786) 


1. Set f(z) = u + iv, g(z) = s + it. Taking the product, for example, we
find for

U(x, y) = Re {f(z) g(z)} = us − vt
V(x, y) = Im {f(z) g(z)} = ut + vs

that

\[ U_x = u_x s + u s_x - (v_x t + v t_x) = v_y s + u t_y + u_y t + v s_y = u t_y + u_y t + v_y s + v s_y = V_y, \]

and so on.


2. For f(z) = u + iv, on differentiating u² + v² = constant, we obtain the
pair of equations

\[ u u_x + v v_x = 0, \qquad u u_y + v v_y = 0. \]

Replacing the second equation through the Cauchy-Riemann equations
by one in derivatives with respect to x alone, we obtain a system with
only the solution $u_x = v_x = 0$ (unless we are dealing with the trivial
case u² = v² = 0). Consequently, $u_y = v_y = 0$ and the result follows.


3. (a)-(c) Everywhere continuous; not differentiable.


(d) Continuous for z ≠ 0; not differentiable.


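4. If $z = re^{i\phi}$, ζ = ξ + iη, then

\[ \xi = \frac{1}{2}\left(r + \frac{1}{r}\right)\cos\phi, \qquad \eta = \frac{1}{2}\left(r - \frac{1}{r}\right)\sin\phi. \]

If r = constant = c, then

\[ \frac{\xi^2}{\frac{1}{4}(c + 1/c)^2} + \frac{\eta^2}{\frac{1}{4}(c - 1/c)^2} = 1; \]

if φ = constant = c, then

\[ \frac{\xi^2}{\cos^2 c} + \frac{\eta^2}{\cos^2 c - 1} = 1 \]

(cf. p. 256, Exercise 8).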
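
The two families of confocal conics can be checked with a short numerical sketch. It assumes the mapping in question is ζ = ½(z + 1/z), which is what the formulas for ξ and η in this answer imply; the function names are illustrative.

```python
import cmath
import math

def joukowski(z):
    """Assumed mapping zeta = (z + 1/z) / 2."""
    return 0.5 * (z + 1 / z)

def ellipse_residual(c, phi):
    """Image of the circle r = c: xi^2/A^2 + eta^2/B^2 - 1 should vanish."""
    w = joukowski(c * cmath.exp(1j * phi))
    A = 0.5 * (c + 1 / c)
    B = 0.5 * (c - 1 / c)
    return (w.real / A) ** 2 + (w.imag / B) ** 2 - 1

def hyperbola_residual(r, c):
    """Image of the ray phi = c: xi^2/cos^2(c) + eta^2/(cos^2(c) - 1) - 1 should vanish."""
    w = joukowski(r * cmath.exp(1j * c))
    return (w.real / math.cos(c)) ** 2 + w.imag ** 2 / (math.cos(c) ** 2 - 1) - 1
```

Both residuals stay at rounding level, in agreement with circles going to ellipses and rays going to hyperbolas with common foci at ζ = ±1.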


5. From 8.1, Exercise 3b we know that the transformation maps circles
into circles. Since the two points are fixed, circles through them map into
circles of the same family in both the transformation and its inverse.
Since the mapping is conformal, the same is true of the orthogonal family
of circles.


6. Set z = x + iy, ζ = 1/z = ξ + iη. Thus,

\[ \xi = \frac{x}{x^2 + y^2}, \qquad \eta = -\frac{y}{x^2 + y^2}, \]

and we recognize inversion as the composition g(f(z)) of f(z) = 1/z and reflection
in the x-axis, $g(\zeta) = \bar\zeta$. Since reflection is conformal (with reversal of
the sense of angles) and 1/z is analytic, inversion is conformal. Reflection
maps circles into circles, and 1/z, a general linear transformation
(see Exercise 5), does the same; hence, inversion does the same. The
Jacobian of inversion is the product of those for reflection and for 1/z,
hence, for inversion it is

\[ -|f'(z)|^2 = -\frac{1}{|z|^4}. \]

7.
\[ |\zeta|^2 = \frac{\alpha\bar\alpha z\bar z + \beta\bar\beta + (\alpha\bar\beta z + \bar\alpha\beta\bar z)}{\beta\bar\beta z\bar z + \alpha\bar\alpha + (\alpha\bar\beta z + \bar\alpha\beta\bar z)}. \]

Now for $\alpha\bar\alpha - \beta\bar\beta = 1$ the difference between the numerator and the
denominator is

\[ z\bar z - 1; \]

so the numerator is greater than the denominator for |z| > 1, and
smaller for |z| < 1. If $\beta\bar\beta - \alpha\bar\alpha = 1$, the converse is the case.
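
A small numerical sketch of the argument in Exercise 7 follows. It assumes the transformation has the form ζ = (αz + β)/(β̄z + ᾱ) with αᾱ − ββ̄ = 1, which is the form consistent with the numerator and denominator written above; the parametrization by cosh and sinh is just a convenient way to satisfy that constraint.

```python
import cmath
import math

def make_coefficients(t, u, v):
    """Return (alpha, beta) with |alpha|^2 - |beta|^2 = cosh^2 t - sinh^2 t = 1."""
    return math.cosh(t) * cmath.exp(1j * u), math.sinh(t) * cmath.exp(1j * v)

def moebius(alpha, beta, z):
    """Assumed form of the map: (alpha z + beta) / (conj(beta) z + conj(alpha))."""
    return (alpha * z + beta) / (beta.conjugate() * z + alpha.conjugate())

def numerator_minus_denominator(alpha, beta, z):
    """|alpha z + beta|^2 - |conj(beta) z + conj(alpha)|^2, which should equal |z|^2 - 1."""
    num = abs(alpha * z + beta) ** 2
    den = abs(beta.conjugate() * z + alpha.conjugate()) ** 2
    return num - den
```

For any z the difference reproduces |z|² − 1, so the interior of the unit circle goes to the interior and the exterior to the exterior.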


8. First transform, by putting ζ = az + b, into the unit circle; then
apply the transformation


9. Use

\[ \zeta_1 - \zeta_2 = \frac{(\alpha\delta - \beta\gamma)(z_1 - z_2)}{(\gamma z_1 + \delta)(\gamma z_2 + \delta)}. \]


Exercises 8.3 (p. 796) 
1. (a) Write the integrand in the form

\[ \frac{1}{2}\left(\frac{1}{z - 1} + \frac{3}{z + 1}\right). \]

The first term in parentheses is analytic in the neighborhood of
z = −1; hence, its integral around a small circle centered at −1 is
0. Similarly, the integral of the second term around a small circle
centered at 1 is 0. To evaluate the integral in the circle about 1,
set $z = 1 + re^{i\theta}$ to obtain πi. Similarly, for the small circle about −1, the
integral is 3πi.

(b) Take a path circling 1 in one sense three times as many times as it
circles −1 in the other; for example, see Fig. 8.12.
Figure 8.12
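
The two values πi and 3πi can be confirmed by direct numerical integration. The sketch below takes the integrand in the decomposed form ½[1/(z − 1) + 3/(z + 1)] given in part (a) and integrates it counterclockwise around small circles about 1 and −1; the helper name `circle_integral` is illustrative.

```python
import cmath
import math

def integrand(z):
    """Integrand of Exercise 1 in its partial-fraction form."""
    return 0.5 * (1 / (z - 1) + 3 / (z + 1))

def circle_integral(f, center, radius, n=4000):
    """Midpoint-rule approximation of the counterclockwise integral of f around a circle."""
    total = 0j
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        offset = radius * cmath.exp(1j * t)
        total += f(center + offset) * 1j * offset * dt  # dz = i r e^{it} dt
    return total
```

A path that winds three times about 1 in one sense and once about −1 in the other therefore contributes 3πi − 3πi = 0, which is the point of part (b).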


2. $\alpha^z\alpha^\zeta = \exp[z(\log\alpha + 2n\pi i)]\,\exp[\zeta(\log\alpha + 2m\pi i)]$,
whereas
$\alpha^{z+\zeta} = \exp[(z + \zeta)(\log\alpha + 2k\pi i)]$.


Thus, addition of exponents is valid, provided the same branch of the
logarithm is used throughout; that is, n = m = k. Note that this is the
best one can do except in very special cases, for if the addition theorem
is valid, then

\[ k(z + \zeta) = nz + m\zeta + p, \]

where p is some integer. If z and ζ are linearly independent when
considered as two-component vectors and n ≠ m, the components of
z = a + ib and ζ = α + iβ are restricted by

\[ \frac{(n - m)(a\beta - \alpha b)}{\beta + b} = p, \]

an integer, and if n = m ≠ k, then β + b = 0. Neither condition is
generally satisfied.

For the second law,

\[ z^\alpha\zeta^\alpha = \exp[\alpha(\log z + 2n\pi i)]\,\exp[\alpha(\log\zeta + 2m\pi i)] = \exp\{\alpha[\log z + \log\zeta + 2(n + m)\pi i]\}, \]

whereas

\[ (z\zeta)^\alpha = \exp\{\alpha[\log(z\zeta) + 2k\pi i]\}. \]

Here, equality need not even hold if k = n + m because if $z = re^{i\theta}$
and $\zeta = \rho e^{i\phi}$, the conditions −π < θ ≤ π, −π < φ ≤ π do not force θ + φ
to satisfy the same inequalities.

For the third law,

\[ (\alpha^z)^\zeta = e^{\zeta\log\alpha^z} = \exp\{\zeta[z(\log\alpha + 2n\pi i) + 2m\pi i]\} = \exp(z\zeta\log\alpha + 2z\zeta n\pi i + 2\zeta m\pi i). \]

Similarly,

\[ (\alpha^\zeta)^z = \exp(z\zeta\log\alpha + 2z\zeta p\pi i + 2zq\pi i) \]

and

\[ \alpha^{z\zeta} = \exp(z\zeta\log\alpha + 2z\zeta r\pi i), \]

where m, n, p, q, r are arbitrary integers. Thus, we generally expect
equality to hold only if m = q = 0 and n = p = r.

The best one can say is that it is possible to pick branches of the
many-valued functions involved so that the laws of exponents hold, but
we must be cautious about choosing them properly.
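
The failure of the second law on a fixed branch is easy to exhibit. The sketch below uses Python's `cmath.log`, which returns the principal logarithm, so every power is taken on the branch n = m = k = 0; the function names are illustrative.

```python
import cmath

def principal_power(base, exponent):
    """base**exponent computed with the principal branch of the logarithm."""
    return cmath.exp(exponent * cmath.log(base))

def first_law_gap(alpha, z, zeta):
    """|alpha^z * alpha^zeta - alpha^(z + zeta)| on the principal branch."""
    return abs(principal_power(alpha, z) * principal_power(alpha, zeta)
               - principal_power(alpha, z + zeta))

def second_law_gap(z, zeta, alpha):
    """|z^alpha * zeta^alpha - (z * zeta)^alpha| on the principal branch."""
    return abs(principal_power(z, alpha) * principal_power(zeta, alpha)
               - principal_power(z * zeta, alpha))
```

With z = ζ = exp(3πi/4) and exponent ½ the arguments add up to 3π/2, which lies outside (−π, π], and the two sides differ by a factor of −1; with z = ζ = exp(πi/4) the sum stays in range and the law holds.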


3. (a) The values of $i^i$ are $\exp[(2n - \tfrac{1}{2})\pi]$, for integral n.


(b) Set ζ = ξ + iη, $z = re^{i\theta}$, −π < θ ≤ π and a = log r = log |z|. Then,

\[ z^\zeta = \exp[a\xi - (\theta + 2k\pi)\eta]\,\exp\{i[a\eta + \xi(\theta + 2k\pi)]\}. \]

The condition is that aη + ξ(θ + 2kπ) be an integral multiple of π for
each choice of integral k. Setting k = 0, 1, we obtain the condition
ξ = j/2, where j is any integer and, hence, for a ≠ 0 (r ≠ 1),

\[ \eta = \bigl(l\pi - \tfrac{1}{2}j\theta\bigr)/a, \]

where l may be any integer. Thus, for any z not on the unit circle,
there exists an exponent (ξ, η) for each pair of integers j, l such that
all values of $z^\zeta$ are real. If a = 0, the foregoing condition on η
is replaced by the condition ξθ = pπ, where p may be any integer,
and η is now arbitrary. If p ≠ 0, we see that θ = 2πp/j must be a
rational multiple of 2π. If p = 0, ξ may be zero and then θ may be
arbitrary.

(c) Yes. Set z = x + iy, ζ = ξ + iη, where y = η = 0. If x > 0, the
solution of part (b) yields ξ = j/2, where j is any integer. If x < 0,
part (b) yields only integral values of ξ = n.


4. For z = x + iy, we may certainly differentiate under the integral sign
with respect to x and y, since these derivatives are continuous with
respect to the parameters and convergence of the integrals of the
derivatives at the lower limit t = 0 is uniform for x ≥ ε > 0. Since the
Cauchy-Riemann equations hold for the integrand, they must then hold
for the integral. Integration by parts yields the functional equation.


5. Use the theorem in Volume I, p. 525, to show that the series is absolutely
convergent.


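6. (a) The value of the integral round the small circular detour tends to
zero as the circle becomes smaller. If we put $z = e^{i\theta}$ on the unit
circle and z = x, z = iy, respectively, on the axes, Cauchy’s theorem
gives

\[ 0 = \int_0^1 \left(x + \frac{1}{x}\right)^m x^{n-1}\,dx + i\int_0^{\pi/2} (e^{i\theta} + e^{-i\theta})^m e^{in\theta}\,d\theta - i\int_0^1 \left(iy + \frac{1}{iy}\right)^m (iy)^{n-1}\,dy \]

\[ = \int_0^1 \left(x + \frac{1}{x}\right)^m x^{n-1}\,dx + i\,2^m\int_0^{\pi/2} \cos^m\theta\, e^{in\theta}\,d\theta - e^{i\pi(n-m)/2}\int_0^1 \left(-y + \frac{1}{y}\right)^m y^{n-1}\,dy; \]

by equating the imaginary parts of this equation, we get

\[ 2^m\int_0^{\pi/2} \cos^m\theta\,\cos n\theta\,d\theta = \sin\frac{\pi(n-m)}{2}\int_0^1 \left(-y + \frac{1}{y}\right)^m y^{n-1}\,dy \]

\[ = \frac{1}{2}\sin\frac{\pi(n-m)}{2}\int_0^1 (1 - y)^m y^{(n-m-2)/2}\,dy = \frac{1}{2}\left[\sin\frac{\pi(n-m)}{2}\right] B\!\left(m + 1, \frac{n-m}{2}\right) \]

(cf. p. 508).
(b) Use the relation

\[ \left[\sin\frac{\pi(n-m)}{2}\right]\Gamma\!\left(\frac{n-m}{2}\right) = \frac{\pi}{\Gamma\!\left(1 - \frac{n-m}{2}\right)} \]

(cf. p. 508).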
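
The identity of part (a) lends itself to a quick numerical check. The sketch below compares a midpoint-rule value of the left-hand side with the Beta-function expression, using `math.gamma` for B(p, q) = Γ(p)Γ(q)/Γ(p + q). It assumes n > m so that the integral along the imaginary axis converges; the names `lhs`, `rhs`, and `beta` are illustrative.

```python
import math

def lhs(m, n, steps=20000):
    """2^m times the integral of cos^m(t) * cos(n t) over [0, pi/2], by the midpoint rule."""
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.cos(t) ** m * math.cos(n * t)
    return 2 ** m * total * h

def beta(p, q):
    """Euler's Beta function expressed through the Gamma function."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def rhs(m, n):
    """(1/2) sin(pi (n - m) / 2) B(m + 1, (n - m) / 2)."""
    return 0.5 * math.sin(math.pi * (n - m) / 2) * beta(m + 1, (n - m) / 2)
```

For m = 1, n = 2 both sides equal 2/3, and non-integer n such as n = 3.5 with m = 2 agree as well.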


Exercises 8.4 (p. 805) 


1. The integrand has a continuous derivative with respect to z; conse- 
quently, differentiation under the integral sign is permissible. See 
Section 1.8b. 
. It is easily seen that 

hie) = 2. [TO] 

(z) = epin C 

is an analytic function of z. a differentiating under the integral sign 
and using Leibnitz’s rule (cf. Volume I, p. 203), we find that h (z) is 


1S (Hine —leee(n— FG) gey 
oi e) vine(a—Dere(n—aty +) f e n t 
_vl #f on FO) zty 
T Oni Zl Cey p 


Only the terms with u — v <n differ from zero, as otherwise l m ' 


vanishes. On the other hand, a term with u — v < n vanishes for z = 0; 
if u < n, there are no other terms, so that h (0) = 0. If u = n, there 
remains only the term with u — v = n, so that 


hwo) = 2 ef e -AO __ ar — fw). 


z)”+1 


3. By the Cauchy-Riemann equations the partial derivatives $v_x$ and $v_y$ of
v are given; a function v with these derivatives does exist, since the
condition of integrability $u_{xx} + u_{yy} = 0$ is satisfied [see p. 104, formulae
(75a,b)]; v is uniquely determined apart from an additive constant
c and is given by the curvilinear integral

\[ v(x, y) = \int_{(x_0, y_0)}^{(x, y)} (v_y\,dy + v_x\,dx) + c. \]

It also follows from the Cauchy-Riemann equations that v is a potential
function.
4. At z = 1, πi; at z = −1, 3πi (Section 8.3, Exercise 1).
5. Choose a circle of radius R centered at 0, with R = |ζ| so large that
R > 2|z|. Then,

\[ \left|\frac{1}{\zeta - z} - \frac{1}{\zeta}\right| = \frac{|z|}{|\zeta|\,|\zeta - z|} \le \frac{2|z|}{R^2}. \]

Consequently, for the integral, obtain the bound

\[ |f(z) - f(0)| \le 2M|z|/R. \]

Pass to the limit as R tends to ∞.
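6.
\[ |a_\nu| = \left|\frac{1}{2\pi i}\int_C \frac{f(\zeta)}{\zeta^{\nu+1}}\,d\zeta\right| \le \frac{1}{2\pi}\,\frac{M}{\rho^{\nu+1}}\,2\pi\rho = \frac{M}{\rho^{\nu}}, \]

where C is the circle of radius ρ about the origin.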
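7. By assumption $|a_n| > 0$. Consequently,

\[ \text{(i)}\quad |P(z)| = |z|^n\left|a_n + \frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n}\right| \ge \frac{1}{2}|z|^n|a_n|, \]

provided we take

\[ |z| > \max\left\{1,\ \frac{2\,(|a_{n-1}| + \cdots + |a_0|)}{|a_n|}\right\}, \]

for, then,

\[ \left|a_n + \frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n}\right| \ge |a_n| - \frac{|a_{n-1}|}{|z|} - \cdots - \frac{|a_0|}{|z^n|} \ge |a_n| - \frac{|a_{n-1}| + \cdots + |a_0|}{|z|} \ge \frac{1}{2}|a_n|. \]

Now, since P(z) has no roots, f(z) = 1/P(z) is defined everywhere. But, since
|z| > 1,

\[ |f(z)| \le \frac{2}{|z|^n|a_n|} < \frac{2}{|a_n|}. \]

Consequently, f(z) is bounded and therefore constant. We conclude
from the first of the foregoing inequalities that f(z) = 0, which
contradicts f(z) P(z) = 1.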

8. (a)-(b) The residue of f′/f at α is 2πiI. Set $f(z) = (z - \alpha)^p\phi(z)$, where
φ is analytic, φ(α) ≠ 0, and p represents either the order n of the
zero or −m for the pole for parts (a) and (b), respectively. Then

\[ \frac{f'(z)}{f(z)} = \frac{p\,\phi(z) + (z - \alpha)\phi'(z)}{(z - \alpha)\,\phi(z)}. \]

Cauchy’s integral formula then shows that I is the value of
[pφ(z) + (z − α)φ′(z)]/φ(z) when z = α; that is, p.
(c) Apply the theorem of residues (p. 805).


9. (a) The number of roots of the equation P(z) + θQ(z) = 0, by Exercise
8, is

\[ \frac{1}{2\pi i}\int_C \frac{P'(z) + \theta Q'(z)}{P(z) + \theta Q(z)}\,dz. \]

The denominator differs from zero for every θ for which 0 ≤ θ ≤ 1 at
any point of C; the whole integral is therefore a continuous
function of θ. As its value is always an integer, it is constant and,
hence, the same for θ = 0 and θ = 1.


(b) If

\[ |a| < r^4 - \frac{1}{r}, \]

then r > 1; so the equation z⁵ + 1 = 0 has five roots inside the
circle |z| = r; if we put P(z) = z⁵ + 1, Q(z) = az, we have on the
circle |z| = r,

\[ |Q(z)| = |a|\,r < r^5 - 1 \le |z^5 + 1| = |P(z)|. \]
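
The counting argument of parts (a) and (b) can be followed numerically. The sketch below approximates the integral of Exercise 8 for P(z) + θQ(z) = z⁵ + 1 + θaz on the circle |z| = r; the particular values a = 3 and r = 1.5 are chosen only for illustration and satisfy |a| < r⁴ − 1/r.

```python
import cmath
import math

def count_zeros(a, theta, r, n=20000):
    """Approximate (1 / 2 pi i) * integral of F'/F over |z| = r for F = z^5 + 1 + theta*a*z."""
    total = 0j
    dt = 2 * math.pi / n
    for k in range(n):
        z = r * cmath.exp(1j * (k + 0.5) * dt)
        F = z ** 5 + 1 + theta * a * z
        dF = 5 * z ** 4 + theta * a
        total += dF / F * 1j * z * dt  # dz = i z dt on the circle
    return (total / (2j * math.pi)).real
```

The computed count stays at 5 as θ runs from 0 to 1, which is the constancy used in the proof.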


10. From the lower bound (i) in Exercise 7 for |P(z)|, no root can lie
outside or on a sufficiently large circle about 0. Applying the technique
of estimation used in (i) in Exercise 7, we find

\[ \frac{P'(z)}{P(z)} = \frac{n}{z} + R(z), \]

where the remainder R(z) satisfies |R(z)| < M/|z|² outside a circle
of sufficiently large radius r. Take r so large that all the roots of P lie in
its interior. Applying the result of Exercise 8(c), we obtain for the
number of roots, the integral about the circle of radius r

\[ \frac{1}{2\pi i}\int \frac{P'(z)}{P(z)}\,dz = n + \frac{1}{2\pi i}\int R(z)\,dz. \]

Since

\[ \left|\frac{1}{2\pi i}\int R(z)\,dz\right| \le \frac{M}{r}, \]

the remainder integral tends to zero as r → ∞.
11. (a) Follow the method of solution for Exercise 8(a).


(b) If the roots are $\alpha_1, \alpha_2, \ldots, \alpha_j$, if the poles are located at $\beta_1, \beta_2, \ldots, \beta_k$,
and if these have multiplicities $n_1, n_2, \ldots, n_j$ and $m_1, m_2, \ldots, m_k$,
respectively, the integral has the value

\[ n_1\alpha_1 + n_2\alpha_2 + \cdots + n_j\alpha_j - m_1\beta_1 - m_2\beta_2 - \cdots - m_k\beta_k. \]


12. Since $f(z) = e^z$ is everywhere analytic, since f′(z)/f(z) = 1, and since
the integral I of Exercise 8(a) must therefore vanish on any circle,
no matter how large, f(z) can have no roots.


Exercises 8.5 (p. 814) 


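1. (a) Expressing the functions in the neighborhood of α by

\[ f(z) = a_0 + a_1(z - \alpha) + \cdots + a_{n-1}(z - \alpha)^{n-1} + \cdots \]

and

\[ g(z) = (z - \alpha)^{-n}\bigl[c_{-n} + c_{-n+1}(z - \alpha) + \cdots + c_{-1}(z - \alpha)^{n-1} + \cdots\bigr], \]

we obtain the residue

\[ 2\pi i\sum_{\nu=0}^{n-1} a_\nu c_{-\nu-1}. \]

(b) In the foregoing solution, use $c_k = 0$ for k > −n and
$a_{n-1} = f^{(n-1)}(\alpha)/(n - 1)!$.
2. Set

\[ f(z) = (z - \alpha)^2\phi(z) = (z - \alpha)^2\left[\frac{f''(\alpha)}{2!} + \frac{f'''(\alpha)}{3!}(z - \alpha) + \cdots\right] \]

and determine the first-order coefficient in the expansion of 1/φ(z).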
3. (a) π/√2.

(b) Use the result of Exercise 2 for the residues at $e^{i\pi/4}$ and $e^{3i\pi/4}$ to
obtain 3π/(4√2). Here, for $f(z) = (1 + z^4)^2$, $f''(z) = 24z^2(1 + z^4) + 32z^6$
and $f'''(z) = 48z(1 + z^4) + 9\cdot 32z^5$.

(c) The integrand has simple poles at the points $z_k = \omega^{2k-1}$ (k = 1,
2, ..., 2n), where $\omega = e^{\pi i/2n}$ is the principal (4n)-th root of unity.
For k ≤ n, the poles are in the upper half-plane. Thus, from formula
(8.21b) the integral is equal to

\[ 2\pi i\sum_{k=1}^{n}\frac{z_k^{2m}}{2n z_k^{2n-1}} = -\frac{\pi i}{n}\sum_{k=1}^{n} z_k^{2m+1}, \]

where we have used $z_k^{2n} = -1$. Entering the expression for $z_k$ in this
last sum, we obtain I in the form of a geometric series and then
sum to obtain the result:

\[ I = -\frac{\pi i}{n\,\omega^{2m+1}}\sum_{k=1}^{n}\bigl[\omega^{4m+2}\bigr]^k = -\frac{\pi i\,\omega^{2m+1}}{n}\,\frac{1 - (\omega^{4m+2})^n}{1 - \omega^{4m+2}} = \frac{2\pi i}{n\,\bigl(\omega^{2m+1} - \omega^{-(2m+1)}\bigr)} = \frac{\pi}{n\sin[(2m + 1)\pi/2n]}. \]
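
The closed form in part (c) can be cross-checked in two independent ways. The sketch below assumes the integral in question is that of $x^{2m}/(1 + x^{2n})$ over the whole real line with 0 ≤ m < n (the form implied by the residue sum above), and compares the closed form with the residue sum and with a direct quadrature after the substitution x = tan t. All function names are illustrative.

```python
import cmath
import math

def closed_form(m, n):
    """pi / (n sin((2m + 1) pi / (2n)))."""
    return math.pi / (n * math.sin((2 * m + 1) * math.pi / (2 * n)))

def residue_sum(m, n):
    """2 pi i times the sum of z_k^(2m) / (2n z_k^(2n-1)) over the poles in the upper half-plane."""
    omega = cmath.exp(1j * math.pi / (2 * n))
    total = 0j
    for k in range(1, n + 1):
        zk = omega ** (2 * k - 1)
        total += zk ** (2 * m) / (2 * n * zk ** (2 * n - 1))
    return 2j * math.pi * total

def quadrature(m, n, steps=100000):
    """Midpoint rule for the integral over the real line after substituting x = tan(t)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        x = math.tan(-math.pi / 2 + (k + 0.5) * h)
        total += x ** (2 * m) * (1 + x * x) / (1 + x ** (2 * n))
    return total * h
```

With m = 0, n = 2 the closed form reduces to π/√2, the value quoted in part (a).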
4. The left-hand side of the formula is the sum of the residues of the function
$z^k/f(z)$ divided by 2πi and is therefore equal to

\[ \frac{1}{2\pi i}\oint \frac{z^k}{f(z)}\,dz \]

round a circle enclosing all the roots $\alpha_\nu$. But this integral tends to zero
as the radius of the circle tends to infinity (the center remaining fixed).


5. Because x cos x is odd and x sin x is even, the integral is equal to

\[ \frac{1}{2i}\int_{-\infty}^{\infty}\frac{x e^{ix}\,dx}{x^2 + c^2}. \]

The residue in the upper half-plane of $ze^{iz}/2i(z^2 + c^2)$ is $\tfrac{1}{2}\pi e^{-|c|}$. Take
z = r(cos θ + i sin θ) and integrate over the closed path C from −r to r
along the x-axis and over the semicircle |z| = r in the upper half-plane.
We need only prove the part of the integral over the semicircle tends
to zero in the passage to the limit as r → ∞. We find for the integral over
the half circle 0 ≤ θ ≤ π,

\[ J = \int_0^{\pi}\frac{r^2 e^{2i\theta}\,e^{-r\sin\theta}\,e^{ir\cos\theta}}{2\,(r^2 e^{2i\theta} + c^2)}\,d\theta. \]

Choose r so large that $|r^2 e^{2i\theta} + c^2| > \tfrac{1}{2}r^2$; for example, choose r² > 2c².
It follows that

\[ |J| \le 2\int_0^{\pi/2} e^{-r\sin\theta}\,d\theta \le 2\int_0^{\pi/2} e^{-2r\theta/\pi}\,d\theta < \frac{\pi}{r}. \]
0 0 


Miscellaneous Exercises 8 (p. 818) 


1. $(z_1 - z_3)/(z_2 - z_3)$ must be real.
2. Let arg z be the argument of $z = re^{i\theta}$; that is, arg z = θ + 2nπ. The
directed angle from the segment $\overrightarrow{\alpha\beta}$ to the segment $\overrightarrow{\alpha\gamma}$ is

\[ \arg\frac{\gamma - \alpha}{\beta - \alpha} + 2p\pi, \]

where p is an integer. The given equation tells us that

\[ \arg\frac{\gamma - \alpha}{\beta - \alpha} = -\arg\frac{\gamma - \beta}{\alpha - \beta} + 2n\pi. \]

Thus, taking the segment joining α and β as the base of the triangle,
we see that the angles from the base to the sides are equal and opposite
in sign. Conversely, equality of the base angles yields the given
equation.


3.
\[ \lambda = \frac{(z_1 - z_3)/(z_2 - z_3)}{(z_1 - z_4)/(z_2 - z_4)} \]

must be real, for if C is the circle through $z_1, z_2, z_3$, we may transform
C by a linear transformation ζ = (αz + β)/(γz + δ) into the real axis
(cf. Section 8.2, Exercise 8). By Section 8.2, Exercise 9, λ is unchanged.
Then a necessary condition that the image of $z_4$ shall lie on the same
circle as the images of $z_1, z_2, z_3$ is that it be real, which is equivalent to
λ being real.
4. The equality to be proved is

\[ |z_1 - z_2|\,|z_3 - z_4| + |z_2 - z_3|\,|z_1 - z_4| = |z_1 - z_3|\,|z_2 - z_4| \]

or

\[ \left|\frac{(z_1 - z_2)(z_3 - z_4)}{(z_2 - z_3)(z_1 - z_4)}\right| + 1 = \left|\frac{(z_1 - z_3)(z_2 - z_4)}{(z_2 - z_3)(z_1 - z_4)}\right|. \]

Now the expressions inside the absolute value signs are invariant in a linear
transformation (cf. Section 8.2, Exercises 8, 9). If by a suitable linear
transformation we transform the circle into the real axis, we have only
to prove the relation AB · CD + BC · AD = AC · BD for four points
on a straight line, where it is trivial.


5. $\zeta = e^{iz}$ takes every value except ζ = 0, as is easily seen from the
relation $e^{iz} = e^{-y}(\cos x + i\sin x)$. Now we have to choose ζ so that

\[ c = \cos z = \frac{1}{2}\left(\zeta + \frac{1}{\zeta}\right); \]

this quadratic equation always has a solution

\[ \zeta = c \pm \sqrt{c^2 - 1}, \]

and this solution is not zero, so that a corresponding z exists.
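
The construction in Exercise 5 translates directly into a procedure for solving cos z = c. The sketch below picks the root ζ = c + √(c² − 1) of the quadratic and recovers z from ζ = exp(iz) with the principal logarithm; the name `solve_cos` is illustrative, and any other branch of the logarithm or the other root would serve equally well.

```python
import cmath

def solve_cos(c):
    """Return one z with cos(z) = c, via zeta = exp(i z) and zeta^2 - 2 c zeta + 1 = 0."""
    zeta = c + cmath.sqrt(c * c - 1)  # nonzero, since the two roots multiply to 1
    return -1j * cmath.log(zeta)
```

Since the product of the two roots of the quadratic is 1, neither root can vanish, which is why a solution exists for every complex c.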
6. Cf. Exercise 5. If $\zeta = e^{iz}$, then

\[ \tan z = \frac{1}{i}\,\frac{\zeta - 1/\zeta}{\zeta + 1/\zeta} = c \]

or

\[ \zeta^2 = \frac{1 + ic}{1 - ic}; \]

there is a finite ζ ≠ 0 only when c ≠ ±i; hence, tan z = c only has a
solution if c is neither +i nor −i.


7. If z = x + iy, cos z is real if x = nπ or y = 0, and sin z is real if x =
nπ + π/2 or y = 0 (where n is an integer).

8. (a) r = 1 (for |z| > 1 the individual terms tend to ∞; for |z| < 1 compare
with the geometric series).

(b) r = 0.
(c) r = 1.

9. (a) Integrate $e^{iz}/(1 + z^4)$ over upper semicircle:

\[ \frac{\pi\sqrt{2}}{4}\,e^{-\sqrt{2}/2}\left(\cos\frac{\sqrt{2}}{2} + \sin\frac{\sqrt{2}}{2}\right). \]

(b) Integrate $z^2 e^{iz}/(1 + z^4)$ over upper semicircle:

\[ \frac{\pi\sqrt{2}}{4}\,e^{-\sqrt{2}/2}\left(\cos\frac{\sqrt{2}}{2} - \sin\frac{\sqrt{2}}{2}\right). \]

(c) Integrate $e^{iz}/(q^2 + z^2)$ over upper semicircle:

\[ \frac{\pi}{2q}\,e^{-q}. \]

(d) Integrate $z^{\alpha-1}/[(z + 1)(z + 2)]$ over a region bounded by a large circle
about the origin and slit along the positive real axis:

\[ \frac{\pi(2^{\alpha-1} - 1)}{\sin\pi\alpha}. \]

10. (a) +2πi at z = 2nπ, −2πi at z = (2n + 1)π.
(b) +2πi at z = 2nπ + 3π/2, −2πi at z = 2nπ + π/2.
(c) Use the functional equation Γ(z) = Γ(z + ν + 1)/[z(z + 1) ⋯ (z + ν)];

\[ \frac{(-1)^n}{n!}\,2\pi i \quad\text{at } z = -n. \]

(d) 2πi at z = nπi.


11.
\[ |\sinh(x + iy)|^2 = \left(\frac{e^{x+iy} - e^{-x-iy}}{2}\right)\left(\frac{e^{x-iy} - e^{-x+iy}}{2}\right) = \frac{1}{2}(\cosh 2x - \cos 2y) \ge \frac{1}{2}(\cosh 2x - 1). \]

Integrate along the boundary of a square with sides x = ±π(n + ½)
and y = ±π(n + ½), where n is an integer. As n → ∞, the integral
tends to zero; hence, the sum of the residues tends to zero.


12. Write

\[ \frac{\cot\pi t}{t - z} = \frac{\cot\pi t}{t} + \frac{z\cot\pi t}{t(t - z)}. \]

cot πt is bounded on the square $C_n$, and the integrals of (cot πt)/t over
opposite sides of the square almost cancel one another; hence,

\[ \lim_{n\to\infty}\int_{C_n}\frac{\cot\pi t}{t - z}\,dt = \lim_{n\to\infty}\int_{C_n}\frac{z\cot\pi t}{t(t - z)}\,dt = 0. \]

If we put together residues of opposite poles, the sum of the residues
converges and we obtain

\[ \pi\cot\pi z = \frac{1}{z} + 2z\left[\frac{1}{z^2 - 1^2} + \frac{1}{z^2 - 2^2} + \frac{1}{z^2 - 3^2} + \cdots\right] \]

(cf. Volume I, p. 602).
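
The partial-fraction expansion of the cotangent converges slowly but can still be checked numerically. The sketch below sums the series in the standard form π cot πz = 1/z + Σ 2z/(z² − ν²) with poles paired as in the text, and compares it with a direct evaluation for a real argument; the truncation length is an arbitrary choice for illustration.

```python
import math

def cot_series(z, terms=100000):
    """Partial sum of 1/z + sum over nu >= 1 of 2z / (z^2 - nu^2)."""
    total = 1.0 / z
    for nu in range(1, terms + 1):
        total += 2.0 * z / (z * z - nu * nu)
    return total

def pi_cot_pi(z):
    """Direct evaluation of pi * cot(pi z) for real non-integer z."""
    return math.pi / math.tan(math.pi * z)
```

The truncation error after N terms is roughly 2z/N, which reflects why the poles must be paired before the series converges.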
13.
\[ \frac{1}{1 + t} = 1 - t + t^2 - \cdots + (-1)^{n-1}t^{n-1} + \frac{(-1)^n t^n}{1 + t}. \]

Hence,

\[ \log(1 + z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \cdots + (-1)^{n-1}\frac{z^n}{n} + R_n, \]

where

\[ R_n = (-1)^n\int_0^z \frac{t^n}{1 + t}\,dt. \]

If we take $z = e^{i\theta}$ and the straight line from 0 to $e^{i\theta}$ as path of
integration, we have, for $e^{i\theta} \ne -1$,

\[ |R_n| = \left|\int_0^1 \frac{t^n e^{i(n+1)\theta}}{1 + te^{i\theta}}\,dt\right| \le \frac{1}{m}\int_0^1 t^n\,dt = \frac{1}{m(n + 1)}, \]

where m denotes the minimum of $|1 + te^{i\theta}|$ for 0 ≤ t ≤ 1. Hence, if
$z = e^{i\theta} \ne -1$, $R_n$ tends to 0.

If x + 0 and if C’ is a contour in the region in which f is regular and 
contains y but not 0, then, by p. 801, 


CATO f(t) it 
C 


dy” (y— a)”+! — Əri ’' Fa tym 


If we put a = y = vx, the latter integral becomes 


nof faa 
C 


2ri Jc’ (t? — x)r+i ` 


If we then substitute t? = qt, the integral becomes 


| Ral = 


n! f FO) dt 
ori C (T — x)”r+1 ?, 
where C is a contour containing x but not 0; the integral is equal to 
1 d” — 
o g 1 o 1)\, 
© (@)= Ž la e) 
now 
1 1 2v 1 | 2 |z| 
oe — — — < = 
(2v — 1) (2v)2 z[ yz+1 dy = | (2v — 1)7+:}| (2y — 1)i+z , 
and the series >) 1/(2Y — 1)!** is absolutely convergent for x > 0. 
1 1 1 2 2 2 
b 1 — 91-2 = — —— — e o è — — — — _ — — — eee 
(b) ( 21-2)E(z) Ito ta tet z F & 


= — 1 ied eee = 
= 1 z t3 ae t = f(z). 


i — — f(1) » lim ZZ 1 = FO — 
(c) lim (z — 1) Gz) = f1) lim 7 ziz = gay» 


where 
g(z) = 1 — 2), 
Abel’s integral equation, 512
Absolute value, 769 
Absolutely convergent, 771 
Acceleration, normal-, 214 
tangential-, 214 
-vector, 214 
Active interpretation of transformation, 
148 
Additivity for, -areas, 372 
-integrals, 93 
-masses, 387 
Admissibility for variational problem, 740 
Affine, -coordinates, 144 
-mapping, 148, 242 
-transformation, 179, 276 
Algebraic functions, 13, 229 
Alternating, -differential, forms, 307, 324 
-functions, 167, 170, 175 
Amplitude of complex number, 769 
Analytic, -extension, 814—818 
-function, 780, 791 
Anchor ring, 285 
Angle, -between curves, 234 
-between curves on surface, 285 
-between directions, 127—131 
-between surfaces, 239 
solid-, 619, 720 
Angular magnitude, 721 
Anticommutative law of multiplication, 
181 
Apparent magnitude, 721 
Approximation, linear-, 50 
polynomial-, 64 
successive-, 267 
Weierstrass theorem on, 81 
Arc tangent, power series, 777 
principal branch, 12 
Archimedes’-principle, 52, 607 
Area, 367—374, 515 
additivity for-, 372, 522 


basic properties, 519—523 
-derivative, 566 
inner-, 369, 517 
-law, 667 
of curved surface, 424, 428, 540 
-of hypersurface, 453, 460 
-of n-dimensional sphere, 455—458 
of polygon, 203 
-of spherical surface, 426 
outer-, 369, 517, 520 
-swept out by moving curves, 448—453 
-vector, 621 
Argument of complex number, 769 
Associative law, 132, 152 
Astroid, 298 
Averaging of function, 82 


Ball, 9 
Base of vectors, 143 
Beam, loaded, 675—678 
Bernoulli’s, -differential equation, 683, 690 
-numbers, 802 
Bessel function, 475 
Beta function, 508—511 
Binomial, coefficients, 510 
series, 801 —802 
Binormal vector, 216 
Bohr-Mollerup theorem, 499 
Bolzano-Weierstrass principle of the point 
of accumulation, 107 
Boundary, -of oriented region, 580 
-of set, 6, 8, 10 
-value problem, 719, 724 
Bounded sequence, 2 
Brachistochrone problem, 737, 751, 756 
Buoyancy, 607 


Cable, loaded, 672—675 
Calculus, -of errors, 52—53 
of variations, 737 
Cardioid, 302


Cartesian, coordinate system, 127, 146, 156 


product of sets, 117 
Catenary, 751, 768 
Catenoid, 287 
Cauchy-Riemann equations, 58, 288, 780, 
786 
Cauchy-Schwarz inequality, 129, 182, 343 
for integrals, 501
Cauchy’s, -convergence test, 3, 108 
-formula, 799 
-symbol, 28 
-theorem, 789, 803 
Caustic, 302 
Cell, 10 
Center of mass, 432 
Centroid, 432 
Chain rule of differentiation, 55 
Characteristic function of set, 526 
Circle of convergence, 773 
Circular disk, 5, 6 
Circulation, 572, 615 
Clairaut equation, 296, 708 
Closed, -set, 8 
-differential form, 314 
Closure of set, 9, 10, 11, 118 
Columns of matrix, 147 
Commutative law, 132 
Compact, -set, 86, 109 
-support, 492 
Comparison test, 772 
Complement of a set, 116, 118, 119 
Complementary minor, 189 
Components, -of set, 102 
-of vector, 122, 131, 143 
Compound, -functions, 53—55, 62—63 
pendulum, 436—438 
Cone, 59 
Confocal, -conics, 256 
-parabolas, 234, 701 
-quadrics, 287 
Conformal transformation, 256, 288, 785, 
786 
Conjugate, -functions, 803, 805 
number, 767, 777 
Connected, -region, 102 
simply-, 103 
surface, 579 
Connectivity, 358 
Conservation, -of energy, 656—658, 759 


of mass, 567, 571, 603 
Conservative field, 616, 657 
Constraint, 340 
Content, 369, 515—517 
Continuity, -and partial derivatives, 34 

equation, 571, 603 

modulus of-, 67 

-of integral with respect to a parameter, 

14, 464 

uniform-, 112 
Continuous, -deformation, 103 

-function, 17—22, 112—113 
Continuously differentiable, 42 
Contour integration, 807—814 
Convergence, absolute-, 771 

Cauchy’s intrinsic test for-, 3 

circle of-, 773 

-of improper integrals, 411 

of sequence, 2 

radius of-, 773, 802 

uniform-, 771 
Convex, set, 102, 103 

functions, 499—500 

hull, 739 
Coordinate(s), affine-, 144 

Cartesian-, 127, 146, 156 

-curves, 247 

curvilinear-, 246, 251 

cylindrical-, 250 

focal-, 256, 257 

general-, 249 

-lines on surface, 282 

-net, 243, 247 

parabolic-, 248 

polar-, 248 

right-handed-, 184 

spherical-, 249 
-surfaces, 250

-transformation of, 246 

-vector, 129, 133, 143 
Cosines, law of, 71, 127 
Coulomb’s law, 445, 714 
Cramer’s rule, 163, 177 
Critical points, 326, 352 
Cross product of vectors, 181, 182 
Curl of a vector, 209, 313 
Curvature, center of-, 213, 214, 232 

-of curve, 213, 230, 232 

radius of-, 213, 232 

-vector, 213 


Curve(s), coordinate-, 247 
curvature of-, 213, 230, 232 
discriminant-, 293 
double points of-, 360 
envelope of-, 293 
evolute of-, 301 
family of-, 291—302 
-in implicit form, 230—237 
isolated point of-, 361 
length of-, 283 
multiple point of-, 236 
normal of-, 231 
parallel-, 365 
pedal-, 303 
polygonal-, 112 
sectionally smooth-, 88 
singular point of-, 236, 360 
space-, 282 
tangent of-, 212, 231 
tangential representation of-, 365 
torsion of-, 216 

Curvilinear coordinates, 246—251 

Cusp, 299, 361 

Cut-off function, 494 


Deformation, 244 
Degenerate transformation, 274 
Degree, -of freedom, 757 
-of mapping, 562 
-of polynomial, 13, 119 
Density, 386, 566
Dependent, -functions, 272, 273, 684 
linearly-, 137, 684 
-variables, 11 
vectors, 137 
Derivative, -at boundary points, 27 
directional-, 43, 45, 206 
exterior-, 312 
Fréchet-, 268 
normal-, 557 
-of an implicit function, 223 
-of function of complex variable, 779 
-of mapping, 268 
-of vector, 212 
partial-, 27 
radial-, 45, 62 
Determinants, 160—202 
definition of-, 166—170 
expansion of-, 170, 187 
functional-, 253 
geometrical interpretation of-, 180–187


Gram-, 193 

Jacobian-, 253 

nth order-, 171 

-of matrix, 170 

matrix, 175 

of product, 172 

second order-, 161 

third order-, 161
Diagonal, -rule, 162 

-matrix, 177 
Diameter of set, 376, 523 
Difference, of function, 66 

of points, 125 
Differentiability, 40–42

complex variable, 779 
Differential, exact-, 314 

-of function, 49—51 

-of higher order, 50 

-operator, 209, 684 

total-, 49, 50, 314, 322 
Differential equations, 654—734 

constant of integration for-, 699 


existence and uniqueness of solution of-, 


702—706 
fundamental theorem on linear-, 687 
homogeneous-, 688 
integral curves of-, 697 
integration of-, 656 
linear-, 680, 696 
non-homogeneous-, 691 
-of family of curves, 699—702 
-of first order, 678—682 
-of higher order, 683—690 
-of second order, 688 
ordinary-, 654—712 
partial-, 713—735 
-systems of, 709—710 
-with constant coefficients, 696, 699, 
812—814 

Differential form, alternating-, 307—324 
closed-, 314 
exterior-, 316 
integral of-, 589—601, 647—653 
linear-, 84 
non-alternating-, 308 
quadratic-, 283 

Differentiation, area-, 565
change of order of-, 36—39 
-for inverse functions, 252 
-to fractional order, 511–512
under the integral sign, 74—80, 466—468 

Dipole, 717 

Dirac function, 674 

Direction, -cosines, 129 
-numbers, 130 

Directional derivative, 44 

Dirichlet’s discontinuous factor, 479 

Disconnected, 102 

Discontinuous, 18 

Discriminant, 304, 347 

Disjoint sets, 116 

Disk, 5, 6 

Distance, -from hyperplane, 135 
-from surface, 343 
-of points, 127, 146 

Distributive law, 132, 152, 165 

Div, 208 

Divergence, -of a vector, 208—210 
theorem, 549, 554, 637-642, 651 

Domain of a function, 11, 12 

Double, -integral, 80, 374—386 
-integral over oriented region, 589—592 
-layer, 717, 719, 720 

Doublet, 717 


Element of matrix, 147 
-of area, 425, 628 
Elementary surface, 624—627, 645—647 
Ellipsoid, 240 
greatest axis of-, 345 
moment of inertia of-, 443 
momental-, 443 
volume of, 417, 462 
Elliptic integral, 78 
Energy, conservation of-, 656, 657, 759 
kinetic-, 656, 758 
potential-, 657 
Envelopes, 292—295, 303—306, 735 
Epicycloid, 302 
ε-neighborhood, 1, 9
Equilibrium, 659—663 
Equipotential surfaces, 715 
Errors, 52—53 
Eulerian integrals, 497—511 
Euler’s, -Beta function, 508 
-constant, 505 
-differential equation, 743, 748, 755, 761, 766
-partial differential equation for homogeneous functions, 120, 761
-representations of motion, 363 
Even permutation, 170 
Evolute, 301—302 
Exp, 457 
Exact differential form, 84 
Exponential function, 782—785, 792, 793 
Extension of function, 20 
Exterior, -content, 517 
differential forms, 312—313, 321—324 
-Jordan measure, 517 
-normal, 580, 633 
-point, 7, 9, 118
Extremals, 755 
Extreme values, 325, 326, 333, 334, 336, 
345 


Families, of curves, 290, 291 
of surfaces, 291 
Fermat’s principle of least time, 740 
Field, direction-, 697 
gradient-, 352 
vector-, 204 
Final point of vector, 125 
Fixed point of mapping, 270, 359, 787 
Fluid flow, 602—605 
Flux, 597, 732 
Focal coordinates, 256, 611 
Folium of Descartes, 224, 238 
Force, electric-, 733 
field of-, 204 
flux of-, 597 
gravitational-, 207, 655 
magnetic-, 733 
surface-, 606 
Form(s), 13, 83, 84 
alternating-, 168, 169, 175 
bilinear-, 164, 165, 167, 168, 179 
differential-, 84, 283, 307—324 
linear-, 83, 163, 164 
multilinear-, 166, 169, 175 
quadratic-, 165, 347 
trilinear-, 165, 168 
Fourier, -integral, 476—496 
-integral theorem, 477, 481, 485, 491 
-transform, 478, 491 
Fréchet derivative, 268 
Free surface, 606 
Freely falling particle, 658 
Frenet’s formulae, 216
Fresnel’s integrals, 473
Function(s), 11, 19 
algebraic-, 13, 229 
alternating-, 167—170 
analytic-, 780, 791 
characteristic-, 526 
compound-, 54, 55, 62 
continuous-, 17, 18, 19, 20, 112 
conjugate-, 803, 805 
convex-, 499 
cut-off-, 494 
dependent-, 273—275, 684 
differentiable-, 41, 42, 45
domain of-, 11, 12, 16, 17 
extreme values of-, 333 
geometric representation of-, 13—15 
harmonic-, 719 
Hölder-continuous-, 19
implicit-, 218—230 
independent-, 274 
inverse-, 252 
limit of-, 19 
Lipschitz-continuous-, 19 
many valued-, 814 
-of class C’, 42 
-of compact support, 492 
-of functions, 53 
potential-, 719, 803, 805 
rational-, 18 
rational integral-, 12 
support-, 365 
transcendental-, 229 
uniformly continuous-, 18 
variation of-, 742 
Functional, 740 
Functional equation of gamma function, 
498 
Fundamental quantities of surface, 283 
Fundamental system of solutions, 688 
Fundamental theorem, -of algebra, 806 
-on integrability of linear differential 
forms, 95, 104, 616 
-on linear dependence, 138, 158 


Gamma function, 497—508, 818 
Gauss, divergence theorem, 544, 597—610, 
637—642, 651 
-infinite product, 506 
Gaussian fundamental quantities of surface, 
283 
Geodesics, 739, 757, 765
Geometric series, 771 
Global, 222 
Grad, 206 
Gradient, -field, 352 
-vector, 206, 207, 210, 231 
Gram determinant, 193, 194 
Gravitational, -constant, 207, 655 
-field of force, 207, 655 
-potential, 439 
-vector field, 622 
Green’s, 543 
-integral theorems, 556—558, 607—608 
Guldin’s rule, 429, 452 


Half-spaces, 135 
Hamilton’s principle, 757, 758 
Heine-Borel covering theorem, 109—110, 
119 
Helix, 92, 767 
Hemisphere, 14, 279 
Hermite polynomials, 71 
Heron’s formula, 341 
Higher order of vanishing, 22 
Hölder, -condition, 19
-continuous, 19 
-inequality, 343 
Holomorphic, 780 
Homogeneous, -differential equations, 684, 
688 
-fluid, 604 
-functions, 119—121, 124 
-linear system of equations, 138—140 
-medium, 571 
-polynomials, 13, 119 
positively-, 120 
Homotopic, 103 
Huyghens’ theorem, 435 
Hyperbolic paraboloid, 14 
Hyperboloid, 280, 287 
Hyperplanes, 133—135, 201 
Hypersurface, 453, 460 


Identities, 252 

Identity, mapping, 126, 153 
transformation, 63 

Imaginary part, 769 

Implicit, -function theorem, 221, 228, 265 
-functions, 218—230, 261, 265 
-representation, 231, 238 
Improper integrals, 407—416, 462—468
differentiation of-, 467 
integration of-, 467 
Inclination, 249, 353 
Incompressible fluid, 571, 604, 617 
Increment, 83 
Indefinite quadratic form, 346 
Independent, 139 
-functions, 274 
-variables, 11, 60 
-vectors, 137 
Index of closed curve, 352, 355 
Inflection point, 231, 232 
Initial point of vectors, 125 
Inner area, 517 
Integrability conditions for differential, 84, 
98, 314 
Integrability of continuous functions, 526 
Integrable, 407, 525—528 
Integral(s), -curves, 699 
double-, 374—385 
-estimates, 383—385 
Eulerian-, 497 
Fourier-, 476 
Fresnel’s, 473 
-identities in higher dimensions, 622 
improper-, 406—416, 462—468 
law of additivity for-, 383, 529 
Lebesgue-, 407 
line-, 82—106 
multiple-, 367, 388, 531 
-of analytic function, 788 
-of continuous functions, 526 
-of differential forms, 589—597, 634, o4 i 
647—653 
-of functions of several variables, 524— 
525 
-over an elementary surface, 627 
-over regions in more dimensions, 385 
-over sets, 526 
-over simple surfaces, 594—597 
over unbounded regions, 414—416 
reduction of double-, 392 
repeated-, 78 
Riemann-, 89, 407 
transformation of multiple-, 539, 562 
Integration, 78, 80, 515, 656 
-constant, 699 
-of analytic functions, 787—789 
-of rational functions, 809
-of total differentials, 95
-to fractional order, 511 
Interchange of, -differentiations, 36—39 
-integrations, 80 
Interior, -content, 517 
-normal, 580 
-of set, 8 
-points, 6, 7, 8, 9, 118
Interval, 10 
Intrinsic convergence test, 3 
Invariant, 317 
Inverse, -functions, 252, 786 
-image, 242 
-mapping, 154, 242, 266 
-transformation, 261 
Inversion, 243, 244, 256, 277, 787 
Irrotational motion, 572, 616 
Isoperimetric, -inequality, 365—366 
problem, 739, 767 
-subsidiary conditions, 765 
Iteration, 267, 703 


Jacobian, -determinant, 253, 254 
-matrix, 268, 272 
-of product of two transformations, 258, 
276 
Jordan, -measure, 367—370, 515, 517 
-measurable set, 517, 628 


Kepler’s, -equation, 671 
-laws, 665, 667, 669, 671 

Kinetic energy, 656, 758 
-of rotating body, 435 
Lagrange’s, equations, 759
-multiplier, 332, 762—768 
-representation of motion, 363 
Laplace, equation, 58, 62, 573, 617, 713, 
724, 762 
-operator, 211, 608 
-operator in polar coordinates, 62
-operator in spherical coordinates, 610 
Laplacian, 62, 211 
Latitude, 249 
Lebesgue, -area, 371 
-integral, 407 
-measure, 515 
Left-handed screws, 185 
Legendre’s condition, 747, 768 
Lemniscate, 223, 236, 238
Length, -of arc on surface, 283
-of vector, 146, 157 
Level line, 14, 207, 233 
Limit, 9, 19, 21 
-for complex variable, 770, 774 
of function, 19, 21 
-of sequence, 2, 9, 21
Line, contour-, 14, 233 
element, 283 
level-, 14, 207, 233 
parametric representation of-, 131 
vector representation for-, 130 
Line integrals, 85—91 
additivity of-, 93 
-independent of the path, 96, 104 
Linear, -approximation, 50 
-dependence, 137, 684 
-equations, 137, 138, 175—177 
-homogeneous function, 124 
-differential form, 84, 93, 95 
manifolds, 134, 144—146 
mappings, 150 
operations, 123 
transformations, 202, 778 
Lines of force, 597 
Lipschitz, -condition, 19 
-constant, 19 
-continuous, 19, 35, 67 
Lissajous figures, 665 
Local, 222 
Logarithm, 792—794 
Longitude, 249 
Lower, integral, 525 
-limit, 541 
-point of accumulation, 542 


Main diagonal of matrix, 157 
Manifold, 317, 543 
abstract-, 653
linear-, 134, 144—146, 195, 198—200
vector-, 204
Mapping(s), 11, 242 
affine-, 148, 242 
-by reciprocal radii, 243 
degree of-, 561—565 
fixed point of-, 270, 359, 787 
identity-, 126, 153 
inverse-, 242, 266 
linear-, 150 
-of directions, 259 
-of sets, 11, 534
-of vectors, 148
open-, 535
primitive-, 264
resultant-, 257
symbolic product of-, 152, 257
Mass, center of-, 432
conservation of-, 571, 603
moment of-, 431
total-, 387
Matrices, 147
addition of-, 151
columns of-, 147
determinants of-, 170
diagonal-, 177
elements of-, 147
Jacobian-, 268, 272
main diagonal of-, 151
minor of-, 189
multiplication of-, 151
nonsingular-, 150, 155, 175
operations with-, 150, 153
orthogonal-, 156, 175
product of-, 151—153, 172
reciprocal-, 153, 154, 155
rectangular-, 150, 153
rows of-, 147
singular-, 150, 155, 175
square-, 150, 153
transpose-, 157, 173
unit-, 153, 154, 177
upper triangular-, 178
zero-, 153
Maximum, absolute-, 325
-of continuous function, 112
relative-, 325, 347, 349
strict-, 325
value-, 327
-with subsidiary conditions, 330—334
Maxwell’s equations, 731—734
Mean, arithmetic-, 341
-density, 387
geometric-, 341
Mean value theorem, -for functions, 67
-for potential functions, 722
Minimal surfaces, 762
Minimum, -of continuous function, 112
relative-, 325, 347—349
strict-, 325
-with subsidiary conditions, 330—334
Minor of a matrix, 189
Möbius band, 582, 589
Modulus, -of complex number, 769 
of continuity, 18, 19, 67 
-of elasticity, 675 
Moment, -of dipole, 717 
-of inertia, 433—435 
-of inertia of ellipsoid, 443 
-of mass distribution, 431—432 
of momentum, 666 
-of velocity, 666 
Momental ellipsoid, 443 
Momentum, 602, 655 
Monomial, 13 
Morera’s theorem, 803 
Motion, equations of-, 654—656 
planetary-, 665—671 
Multiplier, 334—340, 762—768 


N-dimensional, ball, 459 
-Euclidean space RN, 10, 124 
sphere, 455 
-surface, 645, 648 
-vector space, 143 
Negative definite quadratic form, 346 
Neighborhood, 1, 9 
Newton’s, -law of attraction, 204, 665 
-second law, 654
Non-homogeneous differential equation, 684
Non-overlapping sets, 368
Non-singular matrix, 150, 155, 175 
Non-trivial solution, 138, 140 
Normal, -acceleration, 214 
-derivative, 557 
-distance, 448 
exterior-, 580 
hyperplane, 135 
outward-drawn-, 599 
positive-, 593 
-to curve, 230—231 
-to hyperplane, 134—135 
-to surface, 238, 283, 284 
-velocity, 448


Odd permutation, 170 

One sided surface, 582 

Open, -mapping, 535 
-set, 8 

Orders of magnitude, 22 

Orientability, 583

Orientation, continuously varying-, 578, 586
-of curves on surfaces, 587
-of hyperplanes, 200, 201
-of parallelepiped, 186, 195, 198, 199
-of parallelogram, 180
-of planes, 200, 201 
opposite-, 86, 185, 196 
standard-, 196 
-transformed, 260 
Oriented, area, 91 
-boundary, 580 
-hyperplanes, 201 
-linear manifold, 200 
-parallelepiped, 194, 195
-simple closed curve, 86, 91 
-surface, 578, 580, 629, 633 
-tangent plane, 577 
Orthogonal, -curves, 234 
-matrices, 156, 158, 175 
-trajectories, 701, 707 
-transformations, 157 
-vectors, 133 
Orthogonality relations, 145, 146 
Orthonormal, -base, 145 
-system of vectors, 145, 156, 158 
Oscillations, 661—665 
Osculating plane, 215 
Outer area, 517, 520 
Overlapping, 368 


Parabolas, coaxial-, 244 
confocal-, 234, 244, 248 
Parabolic coordinates, 248 
Paraboloid, hyperbolic-, 14 
-of revolution, 14 
Parallel curves, 365 
Parallel displacements, 124 
Parallelepiped, orientation of, 186, 195, 
198, 199 
rectangular-, 10, 12 
-spanned by vectors, 186, 191
volume of-, 187, 191, 193, 194, 195, 197
Parallelogram, area of-, 182, 184, 190, 191
orientation of-, 180 

Parametric representation, -of arc, 86 
-of line, 131 
-of surface, 278, 576 

Parseval’s identity for Fourier transforms, 488, 496
Partial, 27, 29, 34
-derivative, 26—30
-differential equation, 713—736 
-sums, 771 
Partition of unity, 635, 636 
Passive interpretation of transformation, 148 
Paths, 102 
family of-, 103, 105 
homotopic-, 103 
-of rays of light, 740 
support of-, 111 
Pathwise simply connected, 102 
Pendulum, 436—438 
Permutation, 170 
even-, 170 
odd-, 170 
Perpendicular, -distance, 192 
-vectors, 133 
Plane, osculating-, 215, 216 
perpendicular distance from-, 192 
tangent-, 239 
-waves, 490, 729 
Planetary motion, 665—671 
Planimeter, 453 
Plateau’s problem, 762 
Poincaré, -identity, 358 
-index, 353 
-lemma, 313 
Point, boundary-, 6, 7 
critical-, 326, 352 
double-, 360 
exterior-, 6, 7, 8, 118 
fixed, 787 
-in n-dimensional space, 10 
interior-, 6, 7, 8, 118
isolated-, 361 
-of inflection, 231, 232 
rational-, 370 
saddle-, 327, 347 
sequences of-, 2 
singular-, 360, 362 
stationary-, 326 
Poisson’s integral formula, 724—726 
Polar, -coordinates, 61 
-planimeter, 453 
-reciprocal, 303 
Pole of analytic function, 805 
Polygonal curve, 112 
Polygonally connected, 68 
Polynomial(s), 13, 18 
Hermite-, 71 
Taylor-, 64
trigonometric-, 124 
Position vector, 126 
Positive, -definite quadratic form, 346 
-normal of surface, 579, 593 
-side of oriented surface, 579 
-side of plane, 201 
Positively homogeneous, 120
Potential, -due to a spherical surface, 441, 
716 
-energy, 439, 657, 758 
equation, 62, 211, 718—726 
-functions, 719, 722, 802, 805 
-of attracting charges, 714 
-of ellipsoid of revolution, 444 
-of forces, 657, 661 
-of solid sphere, 716 
-of straight line, 716—719 
-of uniform double layer, 720 
Power series, 772—777, 799—802 
Pressure, 605 
Primitive, -mappings, 264 
-nth root, 11, 821 
-transformation, 264 
Principal, -branch of arc tangent, 12 
-normal, 213, 265 
-value of logarithm, 794—802 
Product, cross-, 181 
of differential forms, 311—312, 321 
-of mappings, 257 
-of matrices, 152 
scalar-, 131—133 
symbolic-, 152, 257 
vector-, 181, 182, 187 


Quadratures, 679 

Quadratic form, discriminant of-, 347 
indefinite-, 346 
negative definite-, 346 
positive definite-, 346 

Quadratic, 179 


Radius of convergence, 773, 802 
Rational, -functions, 809
-integral function, 12
-points, 370
Reaction forces, 215, 659
Real part, 769
Reciprocal matrix, 153, 154, 155
Reflection with respect to unit circle, 243
Region, connected-, 4, 102
rectangular-, 7, 10
simply connected-, 4, 102—104
Relative, -boundary, 648
-closure, 648
-error, 53
-extremum, 326, 349
-maximum, 325, 347—349
-minimum, 325, 347—349
Relatively open, 648 
Remainder in Taylor expansion, 69 
Repeated integration, 78 
Residue, -at point, 805
-theorem, 805
Restriction of function, 12
Resultant, -mapping, 257
-transformation, 257
Riccati’s differential equation, 690, 691
Riemann, -integrable, 407, 525
-integral, 89, 407
-sum, 89, 525, 530
-zeta function, 797, 820
Riemann-Lebesgue lemma, 481
Right handed screws, 185
Rigid motions, 157, 202
Rolle’s theorem, 352
Rotation, clockwise-, 200
counterclockwise-, 200
-of axes, 61, 202
sense of-, 200
Rows of matrix, 147 


Saddle point, 347 
Saddle-shaped, 15 
Sag, -of beam, 675 
-of cable, 672 
Scalar, 123, 205, 318 
gradient of a-, 205—208, 210 
-multiplication of matrices, 151 
-products of vectors, 131—133, 157 
Sectionally smooth, 5, 88 
Semi-continuity, 542 
Sense, -of curves, 357 
-of rotation, 200 
of vectors, 185 
Sequence, bounded-, 2 
convergence of-, 2 
limit of-, 2, 9, 21 
lower limit of-, 541 
-of complex numbers, 770
-of points, 2
Sequentially compact, 109 
Separation of variables, 678 
Series, 770 
Set, boundary of-, 10, 118 
closed-, 8, 109 
closure of-, 10, 118 
compact-, 109 
complement of-, 116, 118, 119 
connected-, 102 
diameter of-, 376, 523 
disjoint-, 116 
empty-, 114 
null-, 114 
open, 8, 109 
simply connected-, 102, 103 
Sets, Cartesian product of-, 115 
disjoint-, 116 
family of-, 113 
intersection of-, 115—117 
Jordan-measurable-, 517 
non-overlapping-, 368 
Shell, spherical, 580 
Shortest line joining two points, 764 
Simple, -arc, 86 
-surface, 631—634, 648 
Simplex, 462 
Simply connected sets, 102—103 
Singular, -matrix, 150, 155, 175 
-points of curves, 236, 360—362 
surfaces, 362—363 
-solutions, 701 
Singularity of analytic function, 804 
Sink, 574 
Slope of surface, 27 
Smoothing of function, 81 
Solid angle, 619 
Solutions, nontrivial-, 138 
trivial-, 138, 140 
-system of fundamental, 687, 688 
Solvability of system of linear equations, 
150 
Source of mass, 574 
Space differentiation, 387 
Spanned by vectors, 144 
Speed of propagation, 491 
Spherical, -coordinates, 404 
-law of cosines, 71 
-pendulum, 663 
-shell, 580
Square matrices, 150
Stability of equilibrium, 653—659 
Statics, principles of-, 618 
Stationary, -character, 737 
-point, 345, 351, 742 
-values, 331, 349, 754 
Steady flow, 573 
Stereographic projection, 280, 290 
Stokes’, -integral theorem, 554, 555, 572, 
611—617, 642, 643 
-formula in higher dimensions, 624, 651—653
Straight line, parametric representation of-, 
131 
vector representation of-, 131 
String, plucked-, 735 
vibrations of-, 727 
Strophoids, 300 
Subadditivity of outer areas, 520 
Subset, 114 
Subsidiary conditions, 330—336, 762—767 
Successive approximation, 266, 703 
Sum(s), lower-, 376, 524 
-of vectors, 125 
Riemann-, 89, 525, 530 
upper-, 376, 524 
Superposition, principle of-, 683—684 
Support, compact-, 492 
-function, 365 
-of path, 111 
Surface, -areas in any number of dimen- 
sions, 453—455 
area of-, 424, 428 
area of spherical-, 426, 458 
connected-, 579 
coordinate lines on-, 282 
elementary-, 624—625, 632, 645—647 
equipotential-, 715 
-forces, 606 
free-, 606 
geodesics on-, 739, 757, 765 
implicit representation of-, 238—240 
in parametric representation, 278, 576 
-integrals, 624, 645—653, 594—597 
isobaric-, 606 
m-dimensional-, 645, 648 
minimal-, 762 
-normal, 239, 283, 284 
of revolution, 50, 429 
one sided-, 582 
orientation of-, 575—588
oriented-, 578, 580, 629, 633 
simple-, 631—634, 648 
tangent plane to-, 282 
Symbolic product, -of mappings, 125, 152, 
257 
-of operators, 29 
System, -of functions, 241 
-of linear equations, 137, 138, 175—177 
-of mappings, 241 
-of transformations, 241 
orthonormal-, 145, 156, 158 


Tangent, -line, 231 
-plane, 47, 239, 282 
Tangential representation of curve, 365 
Taylor’s, expansion, 65, 64—66 
-series, 68—70, 776, 801 
-theorem, 68—70 
Tetrahedron, 141, 142 
Torus, 102, 285, 286, 589 
Total differentials, integration of-, 95—98 
-of functions, 49—51, 97, 104 
Transcendental functions, 229 
Transformations, affine-, 179, 276 
conformal-, 256, 288, 785 
degenerate-, 274 
inversion of, 261 
-of coordinates, 246 
primitive-, 264 
product of two-, 257 
resultant-, 257 
Translations, 124 
Transpose of matrix, 157 
Trigonometric polynomial, 124 
Triangle inequality, 769, 770 
Trivial solution, 138, 140 
Tube surface, 306 
Twisted curve, 282 


Undetermined, -coefficients, 711, 712 
-multipliers, 334—340, 762—768 

Uniform, -convergence, 464—771 
-approximations, 81 

Uniformly continuous, 18, 112 

Unit matrix, 153, 154, 177 

Unstable equilibrium, 663 

Upper integral, 525 

Upper-triangular matrix, 178 


Variation, first-, 741—743
-of function, 742, 754
-of parameters, 681, 691—694
Vectors, acceleration-, 214
as differences of points, 125
base of-, 143
binormal-, 216
component of-, 122, 131
coordinate-, 123, 129, 133, 143
cross product of-, 180, 181, 182
curl of-, 209, 313
curvature-, 213
definitions of-, 122, 123
divergence of-, 208, 210
electric-, 731
families of-, 211, 212
fields of-, 204, 208, 211
geometric representation of-, 124—127
gradient-, 206, 207, 210, 231
inclination of-, 353
length of-, 127, 146, 157
linear dependence of-, 136, 141
linear forms of-, 163
magnetic-, 731
-manifold, 204
mapping of-, 148, 153
multilinear forms of-, 163—170
opposite-, 126
orthogonal-, 133
orthonormal-, 145, 156
perpendicular-, 133
position-, 126, 127, 212
principal normal-, 213
-product, 180, 188
-representation for lines, 130
scalar products of-, 131—133, 146, 157
spaces of-, 123, 142, 143
spanned by-, 144, 182
sum of-, 122, 125
triple product of-, 181
unit-, 130
vector product of-, 181, 182, 187, 188, 311
zero-, 123, 129
Velocity, -of light, 741
-potential, 617
-vector, 214
Vibrations, -forced, 695
-of a string, 727
Volume, 146, 374, 419
-in any number of dimensions, 453
-of ellipsoid, 417, 418, 462
of n-dimensional ball, 459
-of parallelepipeds, 190—195, 201, 202
-of pyramid, 418
-of region bounded by surface, 600
Vortex, 575
Vorticity, 572, 616


Wallis’s product, 469
Wave, -equation in one dimension, 727—728
-equation in three dimensions, 728, 729, 733, 735, 736
-fronts, 448, 490, 491
plane-, 490
spherical-, 730
traveling-, 728
Weierstrass’, -approximation theorem, 81
-infinite product, 506
-principle of the point of accumulation, 107
Winding number, 100, 564
Work, 616, 657
Wronskian, 686
Wronski’s condition, 688


Zero, -matrix, 153
-vector, 123, 129
Zeros, number of-, 806
-of analytic function, 803
Zeta function, 797, 820


